ABSTRACT


The  Bicentennial  Conference  on  Mathematical  Programming,  held  in  Gaithersburg  on 
November  29-December  1,  1976,  examined  the  relationship  between  mathematical 
programming  and  the  computer.  The  more  than  50  papers  and  panel  discussions 
exhibited this theme in terms of the design for, use of, implementation of, and implications
for mathematical programming software and computations. Particular emphasis
was  placed  on  bringing  out  computer-oriented  subject  matter  not  ordinarily  presented 
in  a  mathematical  programming  context.  These  resulting  Proceedings  document  this 
Conference,  which  was  jointly  sponsored  by  SIGMAP  of  the  ACM  and  by  the  Applied 
Mathematics  Division  of  the  Institute  for  Basic  Standards  of  NBS. 
PREFACE


First I wish to thank the Bicentennial Conference Committee and our host, the
National  Bureau  of  Standards,  for  their  efforts  in  making  this  conference  so  very 
successful.  These  proceedings  record  the  presentations,  but  it  is  not  possible  to  include 
the  many  interesting  discussions  that  took  place  among  the  leading  researchers  as  well 
as  newcomers. 

Special notice should be taken of subjects not previously considered in conferences
on mathematical programming. In particular, two subjects are: (1) interfaces
with the computing environment (hardware and software), and (2) database management
(including matrix generation/report writing). Bill White is to be congratulated on
developing a program that really spans the mainstreams of mathematical programming.
Further,  he  did  so  by  inviting  the  leaders  in  these  subjects  and  helping  them  to  form 
well  organized  sessions. 

Finally,  and  by  no  means  least,  I  extend  special  thanks  to  our  plenary  speakers, 
George  Dantzig  and  William  Orchard-Hays.  It  is  certainly  appropriate  for  these 
founders  of  the  field  of  Mathematical  Programming  Systems  (MPS)  to  have  provided 
the  foundation  for  the  rest  of  the  program. 


Harvey  J.  Greenberg 
FOREWORD


The  Bicentennial  Conference  on  Mathematical  Programming  was  held  on  November  29  - 
December  1,  1976,  at  the  National  Bureau  of  Standards,  Gaithersburg,  Maryland.  This  Conference 
was  jointly  sponsored  by  the  Special  Interest  Group  on  Mathematical  Programming  (SIGMAP)  of  the 
Association for Computing Machinery, and by the Applied Mathematics Division of the Institute for
Basic  Standards,  National  Bureau  of  Standards,  U.  S.  Department  of  Commerce. 

The  basic  theme  of  the  Conference  was  the  relationship  between  mathematical  programming  and 
the  computer,  and  the  presented  papers  and  panel  discussions  exhibited  this  theme  in  terms  of  the 
design  for,  use  of,  implementation  of,  and  implications  for  mathematical  programming  software  and 
computations.  Particular  emphasis  was  placed  on  bringing  out  computer-oriented  subject  matter  not 
ordinarily  presented  in  a  mathematical  programming  context.  Both  contributed  and  invited  papers 
were  presented,  with  the  contributed  papers  being  refereed:  of  54  abstracts  received,  47  full  papers 
were  submitted,  and  30  were  selected  for  presentation  by  the  Session  Chairmen,  together  with  the 
Program  Chairman,  after  the  refereeing  process. 

The  Session  Chairmen  functioned  as  an  extended  program  committee,  and,  in  consultation  with 
the  Program  Chairman,  really  assembled  their  respective  sessions.  Special  thanks  are  due  them  for 
their  contribution  to  the  technical  content  of  the  Conference.  And  thanks  go  to  the  referees:  G. 
Bennington,  J.  P.  Blondeau,  L.  Bodin,  B.  Buzby,  L.  Cooper,  J.  Cord,  R.  Cottle,  R.  Davis,  R.  Dembo, 
W.  Drews,  J.  Dyer,  J.  Eddington,  F.  Fiala,  S.  Fromovitz,  D.  Gay,  M.  Gutterman,  M.  Harrison,  G. 
Hefley,  D.  Himmelblau,  H.  Hoc,  R.  Jeroslow,  D.  Klingman,  T.  Knowles,  G.  Kochenberger,  J.  Kowalik, 
C.  Krabek,  M.  Lenard,  R.  Marsten,  C.  McCallum,  R.  Meyer,  M.  Minkoff,  J.  Mulvey,  R.  O'Neill,  T. 
Prabahakar, L. Pyle, A. Ravindran, H. Salkin, L. Schrage, M. Smith, K. Spielberg, R. Stark, A. Waren,
C.  White,  and  J.  Whiton. 

The  Conference  Committee  did  an  outstanding  job.  Under  the  expert  guidance  of  Harvey 
Greenberg,  the  administrative  chores  were  accomplished  so  as  to  permit  the  technical  program  to  be 
developed  with  a  minimum  of  impact  from  secondary  sources.  In  particular,  Charles  Mylander  and 
Larry Haverly (and his alter ego, Joyce Draper) did yeoman's work on their respective duties. The
advice from Saul Gass was especially welcome, particularly when the Conference was in its formative
stages.

The National Bureau of Standards was a most appropriate location for this Conference, having
been a center of activity in computation and optimization for over a quarter of a century. And the
warm welcome given by B. H. Colvin, Chief of the Applied Mathematics Division, and by A. M.
McCoubrey, Director of the Institute for Basic Standards, was reflected as well in the excellent
conference facilities and logistics provided by NBS. Thanks and appreciation go to NBS, and
especially  to  Bill  Hall,  the  NBS  representative,  and  to  Sarah  Torrence,  of  the  NBS  staff. 


William  W.  White 
REMARKS ON THE OCCASION OF THE
BICENTENNIAL  CONFERENCE  ON  MATHEMATICAL  PROGRAMMING 
THE  EARLY  ROLE  OF  N.B.S. 


George  B.  Dantzig 
Stanford  University 


It's a great pleasure to be here today.
I'm very glad that Dr. McCoubrey, the
Director of the Institute for Basic
Standards of the National Bureau of
Standards, in his introductory remarks,
told us for whom NBS worked, namely the
consumer, industry, the scientific
community, and educational institutions.
It is nice to learn that some parts of the
government truly work for us because some
of us had come to believe that it was the
other way around.


Dr.  McCoubrey  mentioned  the  early  days 
of  linear  programming  and  the  cooperative 
role  played  by  the  Bureau  with  Air  Force 
Project  SCOOP.  I  would  like  to  make  this 
my theme today.

As  some  of  you  may  recall,  I  was  the 
mathematical  advisor  to  the  Air  Force 
Comptroller  at  the  time  that  linear 
programming  was  born.  Thinking  back  to  the 
early  days,  I  have  discovered  a  remarkable 
coincidence  that  I  would  like  to  pass  on  to 
you  between  the  date  chosen  for  this 
commemorative  conference,  November  29, 
1976,  and  the  date  when  linear  programming 
began.  Preliminary  flirtations  with  the 
idea  started  in  the  fall  of  1946.  By 
November  29,  1946,  exactly  30  years  ago, 
linear  programming  was  conceived. 

During  the  war  I  took  part  in  the 
planning  activities  of  the  Air  Force.  In 
the  immediate  postwar  period,  I  was  in  the 
throes  of  trying  to  decide  whether  to  stay 
with  government,  begin  an  academic  career, 
or  do  research  for  industry.  I  couldn't 
make up my mind. There were some people in
the Air Force (particularly Dal Hitchcock
and Marshall Wood) who were very keen on
having me stay. As bait they suggested
that I try to mechanize the planning
process. This challenge intrigued me. In
the fall of '46, I toyed with different
approaches. By November 29, 1946, the idea
came that perhaps the Input-Output
technique of Leontief (for which he
received the Nobel Prize in 1973) could be
suitably generalized. In the winter of
'46, my work began in earnest; by June of
'47, the linear programming model as we
know it today was well along.

Edited from a transcription taken at the
Conference. For additional background
information and more detailed references on
SCOOP and early mathematical programming
influences and activities, see Chapter 2 of
G. B. Dantzig, LINEAR PROGRAMMING AND
EXTENSIONS, Princeton University Press,
1963.

Our  research  from  early  1947  on  was 
influenced  by  a  conference  arranged  by 
Aiken  at  Harvard.  It  was  here  that 
Marshall  Wood  and  I  became  first  exposed  to 
the  idea  of  an  electronic  computer.  It  was 
a  wonderful  conference.  Although  it  was 
only  a  gleam  in  the  eyes  of  the  speakers, 
they  talked  about  electronic  computers  as 
if  they  really  existed  and  indeed  with 
capabilities  very  much  as  they  have  today. 
I  was  overwhelmed  with  the  potential  of 
this  new  tool.  To  appreciate  what  happened 
in  the  early  stages  of  linear  programming, 
it's  well  to  remember  that  we  believed  that 
computers  would  become  practical  within  a 
year  or  two.     We  acted  accordingly. 

It  would  be  interesting  to  speculate  how 
many  important  developments  might  never 
have  happened  if  one  knew  that  it  would 
take  almost  two  decades  for  powerful 
computers  to  become  a  practical  reality. 

From the beginning, it was this gleam in
the eye of the designers that fast,
practical computers really would soon
become available that motivated
computational development of linear
programming. It resulted in the Air Force
decision to mechanize the planning
process. The name for the effort was Air
Force Project SCOOP, standing for
Scientific Computation of Optimum Programs.
The Office of Naval Research, particularly
Dr. Mina Rees, who is well known to many
of you, played an important role. She
introduced us to John Curtiss and his
mathematics group at the Bureau. Her
office subsidized related research.

Our  group  was  not  technically  equipped 
to  supervise  or  evaluate  the  development  of 
computers.  We  turned  to  the  Bureau  of 
Standards  to  serve  as  our  technical  agents, 
to  keep  us  informed  about  computer 
developments.  And  so  it  came  about  that  I 
became  one  of  the  behind-the-scenes 
sponsors  of  the  early  development  of 
computers.  The  Air  Force  Comptroller 
transferred  huge  sums  of  money  to  the  NBS 
for  this  purpose.  There  were  close 
contacts  with  Sam  Alexander  whose  group  at 
NBS built the SEAC. Air Force money helped
NBS fund the building of BINAC, UNIVAC, and
also some IBM component research. I don't
claim that SCOOP was the only sponsor
(directly or indirectly) in this field;
there were others, for example the Bureau
of the Census; nor did SCOOP sponsor the
building of SWAC under Harry Huskey at the
Bureau's Institute of Numerical Analysis at
UCLA.

I  am  particularly  grateful  for  the 
advice  of  Albert  Cahn  who  worked  for 
Curtiss.  He  recommended  two  key  people 
whom  he  said  I  should  consult  regarding  the 
relation  of  linear  programming  to 
economics,  mathematics,  and  numerical 
analysis:  the  first  was  the  economist, 
Tjalling Koopmans; the other, the famous
mathematician,  Johnny  Von  Neumann.  In  June 
of  1947,  I  went  to  the  Cowles  Foundation  at 
the  University  of  Chicago  to  see  Koopmans. 
This  contact  initiated  his  interest  and 
soon  the  interest  of  other  young  economists 
(many  now  well  known)  in  the  relationship 
between  mathematical  programming  and 
economics.  In  1975  Koopmans  received  the 
Nobel  Prize  for  his  contributions  to  the 
theory  of  resource  allocation.  In  the  fall 
of  1947,  I  went  with  Curtiss  to  Princeton 
to see Von Neumann at the Institute for
Advanced Study. In the course of our
discussions, Von Neumann stated the
duality  theorem,  related  it  to  game  theory, 
and  made  other  observations  that  laid  the 
mathematical  foundations.  (The  history  of 
the  duality  concept  makes  an  interesting 
story  in  itself.) 

In  June  of  1948,  John  Curtiss  again 
introduced  me,  this  time  to  his 
brother-in-law,  Al  Tucker  at  Princeton. 
Soon  thereafter  Tucker  with  his  students 
Harold  Kuhn  and  David  Gale  started  a 
seminar  which  resulted  in  their  well  known 
contributions  to  duality  theory,  game 
theory and nonlinear programming. Von
Neumann and Tucker spearheaded the interest
of  mathematicians. 

While  this  academic  interest  was 
growing,  work  began  in  earnest  within  the 
Air  Force  to  mechanize  the  planning 
process.  During  the  period  1947-52  there 
were  two  main  branches  to  our  efforts  -- 
one  practical,  the  other  theoretical.  The 
goal of the practical branch was to
implement quickly and to solve routinely
very,  very  large  dynamic  programs  required 
for  planning.  These  systems  were  so  large 
that,  even  by  today's  standards,  they  would 
be  beyond  anyone's  capability  to  optimize 
as  a  single  linear  program.  For  practical 
planning Marshall Wood and I invented a
hierarchical  stepwise  optimization  scheme 
called  a  "triangular  model".  We  echelonned 
the  activities  of  the  Air  Force  in 
"parasitic"  order  --  namely  on  the  top  rung 
were  the  combat  units  (which  took  from 
everybody but gave nothing in return); next
below them were the support units in the
combat zone (which took from everybody
below them but gave nothing in return); and
so  forth  down  to  support  units  that 
recruited  and  trained  personnel  and  the 
units  which  bought  and  distributed 
supplies.  Although  called  a  triangular 
model  it  could  be  applied  to  general 
matrices  by  first  making  a  triangular 
approximation  and  then  making  iterative 
corrections  by  adding  on  additional  lower 
rungs  in  the  hierarchical  ordering.  Murray 
Geisler's  group  formulated  Air  Force 
planning  problems  in  this  manner.  Sted 
Nobel  and  others  extended  the  formulation 
to  sectors  of  the  national  economy. 
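
The stepwise idea behind such a triangular ordering can be sketched in a few lines. This is only an illustration under stated assumptions, not a reconstruction of the SCOOP computations: the function name, the three activity levels, and all coefficients are hypothetical. Activities are listed from the top rung down, and each rung's level is its own direct requirement plus whatever the rungs above it demand.

```python
def triangular_requirements(direct_demand, support):
    """Compute activity levels rung by rung, from the top of the hierarchy down.

    direct_demand[k] is the externally specified requirement for activity k.
    support[k][j] is the (hypothetical) amount of activity k needed per unit
    of a higher activity j, with j < k. Entries with j >= k are ignored,
    which is what makes the model "triangular".
    """
    levels = []
    for k, demand in enumerate(direct_demand):
        # Demand induced by every activity higher up in the ordering.
        induced = sum(support[k][j] * levels[j] for j in range(k))
        levels.append(demand + induced)
    return levels


# Hypothetical three-rung example: combat units, combat-zone support, training.
direct_demand = [10.0, 0.0, 5.0]
support = [
    [0.0, 0.0, 0.0],  # the top rung takes from everybody and needs no one above it
    [2.0, 0.0, 0.0],  # each combat unit needs 2 support units
    [1.0, 0.5, 0.0],  # training serves both rungs above it
]
levels = triangular_requirements(direct_demand, support)
print(levels)
```

Because each rung depends only on the rungs above it, one pass down the ordering suffices and no simultaneous system has to be solved.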

The  triangular  model  scheme  was 
implemented  computationally  in  the  spring 
of  1949  by  Mike  Montalbano  of  the  National 
Bureau  of  Standards.  On  the  plane  coming 
to  this  conference  I  ran  into  Mike.  We 
reminisced  about  his  achievement.  Mike  did 
a  very  remarkable  thing:  he  used  IBM 
punched  card  equipment,  the  only  equipment 
available,  in  a  way  that  no  one  had  ever 
used  it  before.  To  iteratively  carry  out 
the  programmed  steps,  he  processed  the 
cards  through  a  sequence  of  machines: 
tabulators, sorters, reproducers,
collators, and IBM 604's, all arranged in a
great  big  circle.  Because  the  equipment 
was  unreliable,  he  also  processed  the  cards 
through  two  pieces  of  similar  equipment 
serially  in  order  to  have  one  machine  check 
the  computations  of  the  other. 

One  day  during  the  development  of  the 
computational  system  by  Montalbano,  Wassily 
Leontief  visited  Washington.  I  asked 
Wassily  if  he  would  like  to  come  on  over  to 
the  Bureau  of  Standards  to  take  a  look 
(they  had  a  downtown  office  where  tests 
were being run). It was a Sunday, and Mike
was  pleased  to  put  on  a  demonstration  of 
just  how  wonderful  his  system  was.  He 
pointed  out  that  the  equipment  was 
unreliable  and  how  he  had  to  get  after  IBM 
regularly  to  make  it  work.  He  pointed  out 
the  features  of  his  system  for  checking 
errors.  "For  example",  he  said,  "See  those 
two  reproducers,  one  following  the  other? 
If  the  holes  punched  by  the  second  do  not 
match  exactly  those  punched  by  the  first, 
the  second  reproducer  will  stop,  turn  on 
red  lights,  and  we  will  know  that  the  last 
card  processed  is  wrong." 

As the cards were running through the
second reproducer, Mike said, "You see, at
this  point  everything  is  working  fine, 
there  are  no  red  lights." 

But  on  the  side  to  me  he  said,  "Isn't  it 
making  a  loud  crunching  noise?" 

As  he  pulled  the  cards  from  the  output 
hopper,  we  saw  that  the  columns  were  laced 
full  of  holes  --  indeed  they  were  nothing 
but  confetti.  The  cards  were  clearly 
overpunched,  but  no  red  lights  had  turned 
on.  Leontief  tapped  me  on  the  shoulder  and 
said,  with  his  Russian  accent,  "There, 
there,  I  understand  these  things.  Don't 
vorry.  A  machine  is  like  a  voman,  very 
temperamental."  This  story  is  a  classic 
example of System Antics, the science of
why complex systems work poorly if at all.
Something  unforeseen  always  happens.  In 
this  case  someone  the  night  before  had 
changed  the  standard  positions  of  the 
control  brushes  on  the  reproducer  and 
forgot  to  change  them  back. 

Turning  now  to  the  theoretical  branch, 
namely  research  on  the  linear  programming 
model,  here  also  the  Air  Force  turned  to 
the  Bureau  for  help.  Jack  Laderman  and  his 
group  at  the  Mathematical  Tables  Project  in 
New  York,  using  hand  calculators,  solved 
Stigler's  nutrition  problem  using  the 
Simplex Method (December 1948). This was
the  first  real  test  of  the  method.  To  do 
the  computations,  Laderman  handed  out  small 
worksheets.  These  completed  worksheets 
were  pasted  together  to  form  a  huge  "table 
cloth".  In  going  through  some  old 
correspondence  recently,  I  found  a  letter 
from  Oskar  Morgenstern  --  he  wanted  to  come 
down  to  Washington  to  see  the  famous  table 
cloth.     I  wonder  whatever  happened  to  it? 


In  early  1950,  Montalbano  with  Corky 
Diehm  wrote  the  first  Simplex  Code  and  made 
successful runs with it on the SEAC
computer.

There  were  others  at  the  Bureau  well 
known  to  you  who  contributed  in  a  major  way 
to  the  early  development.  Alex  Orden,  who 
is  here  today,  was  with  the  Bureau  for  a 
brief  period  before  he  joined  our  Air  Force 
group.  There  was  Ted  Motzkin,  Alan 
Hoffman,  and  Leon  Gainen.  George  Suzuki, 
now  with  the  Bureau,  was  with  us  at  SCOOP, 
as  was  Joe  Natrella,  whose  wife  Mary  is 
still at the Bureau. During the early
1950's, Hoffman, Mannos, Sokolowsky and
Wiegman ran comparative tests of the
Simplex Method with alternative algorithms.
At I.N.A., Motzkin, Raiffa, Thompson, and
Thrall developed the Double Description
Method; Motzkin and Schoenberg developed
the Relaxation Method; and C. B. Tompkins,
the Projection Method.

Those were the days to remember! I,
personally, am grateful to the Bureau for
(1) their pioneering the construction of
computers and their sponsorship of others
in the field; (2) their design of the first
linear programming software; (3) their
contacts with economists and mathematicians
which brought about early interest by the
academic community; (4) their sponsorship
of symposia on mathematical programming;
(5) their research on mathematical
programming theory, algorithms, and their
making comparative tests. These
contributions played a major role in the
early rapid development of the mathematical
programming field.
ENERGY MODELS AND LARGE-SCALE SYSTEMS OPTIMIZATION*


George  B.  Dantzig  and  Shailendra  C.  Parikh 
Stanford  University 


The  optimization  of  large-scale  dynamic 
systems  represents  a  central  area  of  research 
whose successful outcome could make important contributions
to the analysis of crucial national and
world problems. Although a great number of papers
have been published on the theory of solving large-scale
systems, not much in software exists that
can successfully solve such systems. We believe
that there has been little progress, because there
has been little in the way of extensive experimentation
comparing methods under laboratory-like conditions
on representative models. At Stanford's
Systems  Optimization  Laboratory  (SOL),  to  bridge 
this  gap  between  theory  and  application,  we 

(1) develop experimental software for solving
large-scale  dynamic  systems, 

(2)  systematically  compare  proposed  techniques  on 
representative  models, 

(3)  record  and  disseminate  information  regarding 
experimental  results. 


PILOT  Energy-Economic  Model 

Dynamic models that describe the interactions
between the energy sector and the general
economy  help  in  providing  a  focus  to  our  research 
in  experimenting  with  large-scale  optimization 
models.     Models  of  this  type  are  under  development 
by  a  number  of  groups  to  study  the  energy  crisis 
and  a  probable  future  crisis  in  the  area  of  food 
(agriculture).     Developers  of  these  models  could 
make  effective  use  of  techniques  for  solving 
large-scale  systems,   if  they  were  available. 

Today's  policy  makers  at  industrial, 
governmental  and  international  levels  are  faced 
with  the  decisions  on  providing  the  needed  energy 
in  the  years  to  come  at  acceptable  social  cost. 
Such decisions must take into account many complex
interactions related to the technology of
energy supply, environmental side effects, energy
resource conservation, etc., as well as the national
welfare considerations of unemployment, inflation
and living standards.

*Research of this paper was partially supported
by the Office of Naval Research Contracts
N00014-75-C-0267, N00014-75-C-0865; U.S. Energy
Research and Development Administration Contract
E(04-3)-326 PA #18; the National Science Foundation
Grants MCS71-03341 A04, DCR75-04544; the
Electric Power Research Institute Contract
RP 652-1; and the Stanford Institute for Energy
Studies at Stanford University.

Some  of  the  important  questions  that  must 
be  considered  in  detail  in  the  formulation  of  the 
energy policy (or policies) are the following:

(1)  Are  we  using  up  our  cheap  energy  resources  too 
quickly? 

(2)  Are  we  making  sufficient  investments  now  so 
that  new  energy  technologies  will  come  into 
commercial  operation  when  needed  in  future 
years? 

(3) Do we have sufficient physical capacity to
build the required new plant and equipment in
the energy and nonenergy sectors without
seriously hampering growth in consumer consumption,
or will some sacrifice in consumption
be necessary?

(4) What are the various energy options under
different patterns of crude oil import price
realizations?

(5) What will the short and long term impact be
if oil and gas discoveries are less than predicted?

(6) Can we find an energy policy that is robust,
i.e., one which hedges against various contingencies?

It is our belief that dynamic mathematical programming
models can, at the very least, provide
analysis and information on these and other questions
and can substantially improve the understanding
of the interactions that must be considered.
Such models have been developed at IIASA, at the
Electricite de France and by various groups in the
United States.

In the Systems Optimization Laboratory at
Stanford we have under development a linear programming
model for assessing energy-economic options
in the United States, called PILOT [7]. It
spans a wide spectrum of activities of the economy,
from exploration and extraction of raw energy to
industrial production and consumer demands for all
goods and services. The data requirements therefore
cut across many different sources: consumer
surveys, import/export and trade balance data,
manufacturers' surveys, mining data, input/output
and capital coefficients, energy consumption and
substitution data, energy technology data from
Brookhaven National Laboratory and oil and gas
exploration and production data. Hence, there is
a nontrivial problem of achieving consistency and
of selecting a meaningful level of detail so that
the model stands as a whole rather than as a conglomeration
of parts that could collapse under
careful analysis. We believe that the initial version
of the model being built will meet this test.

The  PILOT  model  is  a  statement  in  physical 
flow  terms,  to  the  extent  possible,  of  the  broad 
technological  interactions  within  and  across  the 
sectors  of  the  economy,  including,  but  in  greater 
detail, the energy sector. A typical run of the
model describes what the country could achieve in
physical terms over the long term, say 40 years.

A preliminary version of the model has
been completed and several useful scenarios have
been run [16]. In 1977, improved versions of the
model will be developed, with more detail regarding
exploration and extraction by regions, more
detailed modeling of various conversion processes,
better representation of foreign trade, substitution,
financial flows and the effect of prices on
demand and production.

The initial version of the PILOT model is
an eight period, 40 year model which has approximately
800 constraints and 2000 variables.

The model is a description in input/output
terms of the industrial processes of the economy
and the demands for consumption, capacity formation,
government services and net exports. The
description of the processes that provide useful
energy to the economy constitutes the detailed
energy submodel. This consists of technological
input/output descriptions of the raw energy extraction
and the energy conversion processes as
well as the energy import and export activities.
Four linkages interconnect the energy sector and
the rest of the economy: energy demands of the
economy, bill of goods needed for energy processing
and capacity expansion, total manpower available
to all sectors (including energy) and a trade
balance constraint which requires the equating of
total exports to total imports when these items
are evaluated in 1967 dollars over successive
five year periods. See Figure 1, which shows the
major blocks of coefficients in a time period,
and its link to the next time period shown below
and to the left of the dotted lines.

As noted earlier, the equations of the
model express the balances of various physical
flows. For the energy sector, the balances of
coal, oil, gas, etc. are each expressed in BTU
units. For the economy, the units are 1967
dollars, which are obtained by weighting the
underlying physical flows of goods and services
(assumed to be in fixed proportions) by 1967 prices
per unit.

The industrial sectors of the economy are
represented by a 23-order input/output matrix. The
sectors are grouped as follows: 5 energy sectors,
1 agriculture, 1 nonenergy mining, 5 energy intensive
manufacturing, 4 energy nonintensive manufacturing,
4 services and 3 capital formation.


Consumption is modeled in terms of the consumption
vector of the average consumer. This
sector does not have a fixed bill of goods per
capita but varies as a function of a parameter
representing the real consumption income attained
per capita. Based on an analysis of historical
data, consumption of any item as a function of average
consumption income is nearly linear [2].
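
A minimal sketch of what a nearly linear consumption relation implies, assuming a form c_i = a_i + b_i * y for each item i, where y is per capita consumption income. The intercepts, slopes, and item names below are hypothetical illustration values, not PILOT data.

```python
def consumption_vector(intercepts, slopes, income):
    """Per capita consumption of each item as a linear function of income.

    intercepts[i] and slopes[i] are hypothetical coefficients a_i and b_i
    in c_i = a_i + b_i * income.
    """
    return [a + b * income for a, b in zip(intercepts, slopes)]


# Two hypothetical items: a necessity (large intercept, small slope)
# and a discretionary good (zero intercept, larger slope).
intercepts = [2.0, 0.0]
slopes = [0.25, 0.5]

low = consumption_vector(intercepts, slopes, 10.0)
high = consumption_vector(intercepts, slopes, 20.0)
print(low, high)
```

With nonzero intercepts the proportions of the bill of goods shift as income rises, which is why the per capita bill of goods is not fixed even though each item responds linearly.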

Capacities for each of the 18 nonenergy
sectors and all of the energy processes are differentiated
from one another. The heterogeneous capital
equipment of the nonenergy sectors is depreciated,
whereas the energy facility capacities are
assumed instead to have undepreciated, fixed lifetimes.
Construction lags are used to specify the
time it takes to build new capacity. These construction
lags may be chosen individually for all
18 nonenergy sectors as well as for all energy
facilities.

The  exports  are  treated  as  final  demand 
items.     The  imports  are  considered  in  two  parts, 
competitive  and  noncompetitive.     The  noncompetitive 
imports  are  for  those  goods  and  services  for  which 
no  domestic  substitutes  exist.     They  are  treated 
as a part of the technology of the consuming industrial
sector. On the other hand, competitive
imports of goods and services for which domestic
substitutes do exist are treated as activities that
can augment the domestic production. Finally, a
trade balance constraint ties together the amounts
of all imports and exports. Typically, we have
assumed  over  a  five  year  period  that  the  value  of 
total  exports  be  matched  by  that  of  imports. 

The  labor  force  is  assumed  to  be  exogenously 
given.     Also,  average  labor  productivity  is  assumed 
to  grow  at  an  exogenously  given  rate.     In  sample 
runs,  the  "standard  of  living"  attained  appears  to 
be  very  sensitive  to  this  factor.     In  the  base 
case, 2% per year productivity growth is assumed.

The detailed energy sector contains the conventional
energy technologies, such as oil refinery,
coal  fired  plant,  etc.,  as  well  as  new  technologies, 
such  as  coal  synthetics,  oil  shale,  plutonium  fired 
reactors,  etc. 

The  description  of  the  energy  sector  includes 
an  accounting  of  reserves  remaining  of  three  ex- 
haustible energy  resources:     oil,  gas  and  uranium. 
For  oil  and  gas,  finding-rate  functions  are  used 
to  specify  the  amount  of  oil  in  place  and  gas  re- 
serves to  be  found  for  a  given  amount  of  drilling 
effort.     The  level  of  drilling  effort  is  endoge- 
nously  determined.     The  advanced  (and  expansive) 
techniques  of  secondary  and  tertiary  recovery  are 
also  defined  in  the  model.    Alaskan  oil  production 
and  the  Trans-Alaskan  Pipeline  System  (TAPS)  con- 
struction are  assumed  to  be  exogenously  given.  For 
natural  uranium,  increasing  facilities  and  man- 
power are  required  to  extract  a  ton  of  ore  as  more 
and  more  is  extracted.     In  particular,  progress- 
ively higher  amounts  of  uranium  mining  and  milling 
capacity  are  needed  to  process  the  poorer  grade 
ore  per  pound  of  uranium  oxide  obtained.     While,  in 
principle,  generalized  linear  programming  could  be 
used to model the nonlinear functions, we in fact
replaced them by broken-line (piecewise-linear)
fits.
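
The broken-line replacement can be sketched as follows. This is only an illustration, not the PILOT data: the curve `finding_rate`, its constants, and the breakpoints are hypothetical, chosen to show a concave (diminishing-returns) function being replaced by straight segments.

```python
import math


def finding_rate(effort):
    """Hypothetical concave curve: reserves found vs. cumulative drilling effort."""
    return 100.0 * (1.0 - math.exp(-effort / 50.0))


def broken_line_fit(f, breakpoints):
    """Vertices (x, f(x)) of the piecewise-linear interpolant of f."""
    return [(x, f(x)) for x in breakpoints]


def evaluate(vertices, x):
    """Evaluate the broken-line fit at x by interpolating between adjacent vertices."""
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        if x0 <= x <= x1:
            w = (x - x0) / (x1 - x0)
            return (1.0 - w) * y0 + w * y1
    raise ValueError("x outside the fitted range")


def segment_slopes(vertices):
    """Yield per unit of effort on each straight segment."""
    return [(y1 - y0) / (x1 - x0)
            for (x0, y0), (x1, y1) in zip(vertices, vertices[1:])]


# Assumed breakpoints; a real model would choose them from the data.
vertices = broken_line_fit(finding_rate, [0, 25, 50, 100, 200])
slopes = segment_slopes(vertices)
```

Because the assumed curve is concave, the segment slopes decrease, so each segment could enter a linear program as a bounded activity with its own yield per unit of effort.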

The PILOT model can be used in conjunction
with  any  desired  social  objective.  Consideration 
of  reasonable  alternative  objectives  requires 
further  investigation.     Some  objectives  may  re- 
quire their  expression  partly  in  the  form  of  extra 
constraints  as  well  as  in  the  values  of  the  coef- 
ficients of  the  maximand.     The  objective  chosen 
in the base case in the initial version of the
model is the undiscounted sum of gross national
consumption over all 40 years, subject to
(1) a "monotonic per capita" constraint, which
states that average per capita consumption must
be nondecreasing over time, and (2) an initial
condition, which sets a lower limit on first-period
consumption. Experimentation with other maximands
is  possible.     For  example,  we  have  experimented 
with  discounted  gross  national  consumption. 
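
The base-case maximand and its two side conditions can be written out as a small check on a candidate consumption path. This is a sketch under stated assumptions, not the PILOT formulation itself: the function names, the five-year period length, and the sample numbers are all hypothetical, and `consumption[t]` is taken to be an annual rate in period t.

```python
def satisfies_side_conditions(consumption, population, first_period_floor):
    """Check the 'monotonic per capita' constraint and the first-period floor."""
    per_capita = [c / p for c, p in zip(consumption, population)]
    monotone = all(b >= a for a, b in zip(per_capita, per_capita[1:]))
    floor_ok = consumption[0] >= first_period_floor
    return monotone and floor_ok


def total_consumption(consumption, years_per_period=5, discount_rate=0.0):
    """Sum of gross national consumption over all periods.

    discount_rate=0.0 gives the undiscounted base-case maximand;
    a positive rate gives the discounted variant mentioned in the text.
    """
    return sum(
        years_per_period * c / (1.0 + discount_rate) ** (years_per_period * t)
        for t, c in enumerate(consumption)
    )


# Hypothetical four-period path (arbitrary units).
consumption = [100.0, 105.0, 120.0, 150.0]
population = [10.0, 10.5, 11.0, 11.5]

feasible = satisfies_side_conditions(consumption, population, first_period_floor=95.0)
undiscounted = total_consumption(consumption)
discounted = total_consumption(consumption, discount_rate=0.05)
```

In the model these conditions are linear constraints inside the LP; the sketch only evaluates them after the fact for a given path.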

PILOT is designed to permit one to determine
feasible solutions for our economy: in particular,
what level of investment (in physical terms) in
both the energy and non-energy sectors is
necessary in order to have as high a standard of
living as possible for the growing population.

Once  the  physical  flows  are  determined,  it 
is  possible  to  solve  a  related  financial  invest- 
ment model.     The  financial  flow  model  calculates 
a  system  of  prices,  taxes,   salaries,  profits,  in- 
terest rates,  etc.  that  is  internally  consistent 
in the sense that all economic agents (consumers,
producers, government, etc.) receive sufficient
monies  to  pay  their  expenses  for  the  specified 
physical  flows.     The  prices  generated  by  the  model 
can  be  adjusted  to  be,  at  the  same  time,  nonin- 
flationary,  i.e.,  to  have  the  same  buying  power  as 
base  year  prices.     We  also  are  giving  some  thought 
to  incorporating  in  the  model  production  and  demand 
functions  to  adjust  input/output  coefficients  as  a 
function  of  prices.     In  the  initial  version  of  the 
model  these  coefficients  are  fixed,  however. 

To  illustrate  some  of  the  output  of  the 
model, a typical base case assumes 2% growth in
labor productivity, a 20% limit on the total amount
of  energy  purchased  overseas,  certain  limits  on 
the  rate  of  growth  of  coal  production,  etc.    A  base 
case  run  computes  consumption  income  (income  after 
taxes  and  savings).     In  1975  this  income  per  capita 
(in 1975 dollars) was about $4500. Based on these
assumptions,  the  model  states  it  is  possible  for 
the  country  to  have  a  future  consumption  income 
per  capita  relative  to  consumption  per  capita  in 
1975  as  follows: 

1975    1980    1985    1990    1995    2000    2005    2010
1.0     1.0     1.2     1.6     1.7     1.8     2.0     2.2

This  possible  future  can  be  compared  with  that  ob- 
tained from  another  scenario,  which  is  the  same  as 
the  base  case  but  restricts  the  use  of  nuclear 
power  plants.    This  naturally  results  in  a  lower 
achievable  per  capita  consumption  income.  Relative 
to  1975,  the  results  are  as  follows: 

1975    1980    1985    1990    1995    2000    2005    2010
1.0     1.0     1.2     1.5     1.5     1.5     1.5     1.5

Comparing the two scenarios at the year 2010, we
have 2.2 vs. 1.5, i.e., the nuclear restriction
could reduce the "standard of living" achievable
by 2010 by about 30%. This conclusion has been
criticized because the model assumes that the
consumption patterns of people at different income
levels will remain unchanged (i.e., they won't
practice conservation or change their life styles)
and that production methods will be no more
efficient in the use of energy in the future than
they are today. We feel this criticism has merit
and are, accordingly, considering revisions in the
model to include more substitution and conservation
possibilities.
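
The arithmetic behind the comparison can be reproduced from the two index series quoted above. The helper names are ours, and the implied average growth rates are an added illustration, not a figure reported by the model.

```python
years = [1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010]
base_case = [1.0, 1.0, 1.2, 1.6, 1.7, 1.8, 2.0, 2.2]
nuclear_restricted = [1.0, 1.0, 1.2, 1.5, 1.5, 1.5, 1.5, 1.5]


def relative_shortfall(base, restricted):
    """Fraction by which the restricted scenario falls short of the base case."""
    return (base - restricted) / base


def average_annual_growth(index_start, index_end, n_years):
    """Constant annual growth rate that carries index_start to index_end."""
    return (index_end / index_start) ** (1.0 / n_years) - 1.0


span = years[-1] - years[0]
shortfall_2010 = relative_shortfall(base_case[-1], nuclear_restricted[-1])
base_growth = average_annual_growth(base_case[0], base_case[-1], span)
restricted_growth = average_annual_growth(
    nuclear_restricted[0], nuclear_restricted[-1], span
)

print(f"2010 shortfall: {shortfall_2010:.1%}")  # prints "2010 shortfall: 31.8%"
```

The shortfall of (2.2 - 1.5) / 2.2 is what the text rounds to 30%.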

Because  of  excellent  liaison  with  other 
groups  working  in  the  energy  field,  we  anticipate 
that  the  proposed  physical  flow  model  will  contri- 
bute to  the  formulation  and  solution  of  the  more 
detailed  specialized  models  under  development 
elsewhere.     In  particular,  the  PILOT  model  is  one 
being  compared  with  other  models  by  the  newly 
formed U.S. Energy Modeling Forum in its examination
of the feedback effects from the energy sector
on economic growth.

Solving  Multi-Time-Period  Models* 

Solving  energy  models  by  commercial  linear 
programming  software  is  proving  to  be  expensive. 
Large-scale  techniques,  such  as  those  under  devel- 
opment at  the  Systems  Optimization  Laboratory,  are 
currently  under  test  to  see  if  they  can  help  solve 
these  models  more  efficiently. 

Conceptually,  the  decomposition  principle 
[8]  has  proved  to  be  a  natural  approach  to  break- 
ing up  large  systems  and  to  decentralized  decision 
making.     So  far,   computational  experience  has  been 
limited,  but  it  is  known  that  several  devices  can 
be  effectively  combined  with  the  decomposition 
principle  to  accelerate  the  iterative  process. 
Classic  research  along  these  lines  can  be  found  in 
the  work  of  Rosen  [17],  Beale  [3],  Gass  [9],  Bell 
[4], Abadie [1], and Bennett [5], as well as in the
joint work of Wolfe and Dantzig [8]. Areas of
SOL research include (1) intertemporal models
with staircase structures, (2) the continuous
simplex method for linear control problems with
state-space constraints, and (3) general
large-scale dynamic nonlinear problems.

Recently,  experiments  have  been  conducted 
at  SOL  comparing  the  Decomposition  Principle  ap- 
proach with  a  special  variant  of  the  simplex  method 
known  as  Generalized  GUB.     These  tests  were  limited 
in  nature  but  indicated  that  Generalized  GUB  is 
superior.     Our  research  on  dynamic  systems  is 
therefore  examining  variants  of  the  simplex  method 
as  well  as  special  methods  for  decoupling  stair- 
case and  block-triangular  systems.     See  Figure  2. 
We  plan  to  compare  these  approaches  with  the  nested 
decomposition  algorithm  of  James  K.  Ho  and  Alan  S. 
Manne  for  staircase  systems  [12]. 

Staircase  systems  have  historically  proven 
to  be  very  difficult,  usually  requiring  a  dispro- 
portionately large  number  of  simplex  iterations 
to  solve. 
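
For readers unfamiliar with the term, a generic staircase nonzero pattern can be generated as below. This is a sketch only: the block sizes are arbitrary, and it does not reproduce the actual blocks of the PILOT matrix. It assumes that the activities of period t appear in the constraints of period t and of period t+1 and nowhere else.

```python
def staircase_pattern(n_periods, rows_per_period, cols_per_period):
    """Return a 0/1 nonzero pattern for a generic staircase LP matrix."""
    n_rows = n_periods * rows_per_period
    n_cols = n_periods * cols_per_period
    pattern = [[0] * n_cols for _ in range(n_rows)]
    for t in range(n_periods):
        # Period t's columns touch its own rows and, if present, the next period's rows.
        last_block = min(t + 1, n_periods - 1)
        row_lo = t * rows_per_period
        row_hi = (last_block + 1) * rows_per_period
        for j in range(t * cols_per_period, (t + 1) * cols_per_period):
            for i in range(row_lo, row_hi):
                pattern[i][j] = 1
    return pattern


def show(pattern):
    """Render the pattern with 'x' for a possible nonzero and '.' for a structural zero."""
    return "\n".join("".join("x" if v else "." for v in row) for row in pattern)


pattern = staircase_pattern(n_periods=4, rows_per_period=2, cols_per_period=3)
print(show(pattern))
```

Each block of rows overlaps only its immediate neighbor, which is the coupling between successive time periods that the methods discussed in this section try to exploit.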


* This section is based on a summary prepared by
J. A. Tomlin.

FIGURE 2. The Staircase Structure of the PILOT Energy Model


J. A.  Tomlin  of  SOL  has  experimented  with 
a  partial  decoupling  of  time  periods  within  a 
model  by  relaxing  the  intertemporal  constraints. 
The  expectation  is  that  such  a  relaxation  will 
result  in  a  more  easily  solvable  model,  whose 
solution  can  be  used  as  a  starting  point  for  the 
real  model,  using  some  "gradual"  approach  to  re- 
store the  intertemporal  constraints.    As  might  be 
expected,  the  results  of  this  approach  are  quite 
problem  dependent,  and  sensitive  to  the  degree  and 
the  kind  of  relaxation  employed.     It  appears  that 
tightly  constrained  economic  planning  models,  of 
the  type  available  to  us  for  these  experiments, 
require  a  more  sophisticated  approach.  Other 
types  of  staircase  models  are  being  acquired  to 
further  test  this  idea. 


One  of  the  more  promising  methods  that 
have  been  investigated  for  reducing  solution  time 
for  dynamic  models  involves  several  modifications 
to  the  simplex  method  designed  to  take  advantage 
of  the  special  properties  and  behavior  of  such 
models.     The  essential  property  of  interest  is  the 
tendency  of  the  same  type  of  activity  to  be  basic 
over  several  successive  time  periods.     It  there- 
fore seems  desirable  to  introduce  a  profitable 
type  of  activity  in  as  many  time  periods  as  possi- 
ble simultaneously.     To  do  this  M.A.  Saunders  and 
J. A. Tomlin have explored variants of the
reduced-gradient method (a nonlinear programming
algorithm already implemented by Murtagh and
Saunders in MINOS [14]) on these linear problems
to change several nonbasic variables simultaneously,
in contrast to the standard simplex method, which
changes only one nonbasic variable at a time. To
ensure  that  the  correct  nonbasic  variables  are 
used,  a  "special  pricing"  technique  is  employed. 
When  the  problem  is  read  in,  similar  activities 
in  different  time  periods  are  identified  (from  the 
column  or  variable  names)  and  linked  by  a  circular 
list.     Thus  when  an  activity  is  priced  out  and 
found  to  have  a  favorable  gradient,  the  correspond- 
ing vectors  in  other  time  periods  can  be  easily 
found  and  examined,  and,  if  satisfactory,  included 
as  candidates  to  be  changed.     It  is  then  possible 
to  make  a  step  which  introduces  an  activity  in 
several  successive  time  periods  simultaneously. 
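
The linking step can be sketched as follows. This is not the SOL implementation: the naming convention `<activity>.<period>`, the function names, and the sign convention (a negative reduced cost is taken as favorable, as in a minimization) are assumptions made for illustration.

```python
from collections import defaultdict


def link_activities(column_names, sep="."):
    """Link columns sharing an activity name into circular lists.

    Returns next_col, where next_col[j] is the column index of the same
    activity in the following period, wrapping around after the last one.
    """
    groups = defaultdict(list)
    for j, name in enumerate(column_names):
        activity, _, period = name.partition(sep)
        groups[activity].append((period, j))
    next_col = {}
    for members in groups.values():
        ordered = [j for _, j in sorted(members)]
        for a, b in zip(ordered, ordered[1:] + ordered[:1]):
            next_col[a] = b
    return next_col


def linked_candidates(j, next_col, reduced_cost, tol=1e-9):
    """Walk the circular list from column j, keeping columns with a favorable reduced cost."""
    chosen = []
    k = j
    while True:
        if reduced_cost[k] < -tol:
            chosen.append(k)
        k = next_col[k]
        if k == j:
            break
    return chosen


# Hypothetical column names and reduced costs.
names = ["COAL.1975", "NUKE.1975", "COAL.1980", "NUKE.1980", "COAL.1985", "NUKE.1985"]
reduced_cost = [-2.0, 0.5, -1.0, 0.1, 0.3, -0.2]

next_col = link_activities(names)
candidates = linked_candidates(0, next_col, reduced_cost)
```

With these numbers, pricing out column 0 also picks up the 1980 coal column but skips the 1985 one, whose reduced cost is unfavorable. The circular list needs only one pointer per column and is built once, when the problem is read in.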

Preliminary  experiments  with  the  above 
approach have led to a reduction of 20-30% in
Phase  II  iterations  when  compared  to  the  standard 
simplex  method  applied  to  the  type  of  economic 
planning  models  referred  to  above.     It  is  clear 
that  many  tactical  variations  of  the  scheme  need 
to  be  studied. 

If  it  is  advantageous  to  bring  in  an  activ- 
ity simultaneously  in  many  time  periods,  then, 
conversely,  it  should  be  advantageous  to  be  able 
to  also  force  an  unprofitable  activity  to  its 
lower  bound  in  several  time  periods  simultaneously. 
This  is  rather  more  difficult,  since  one  cannot 
tell  whether  a  whole  group  of  variables  can  reach 
their  bounds  while  maintaining  a  feasible  solution 
(at least, not without incurring a heavy computational
cost). The approach we have implemented
identifies  groups  of  unprofitable  nonbasic  activ- 
ities close  to  their  bounds  and  forms  a  direction 
vector  scaled  in  such  a  way  that  if  one  of  these 
variables  reaches  its  bound  then  all  of  them  do. 
This  method  has  had  some  success  in  reducing  the 
iteration  count  when  combined  with  the  "special 
pricing"  described  above.    Again  many  variations 
are  possible,  and  very  considerable  further  ex- 
perimentation is  required  to  refine  the  methodology 
and  expand  on  the  promising  results  achieved  so 
far. 
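
The scaling idea can be illustrated in a few lines. This is a simplified sketch with hypothetical names: it treats only the nonbasic components, and omits the corresponding change in the basic variables and the ratio test that could cut the step short in a real implementation.

```python
def joint_bound_direction(x, lower, group):
    """Direction that takes every variable in `group` to its lower bound at step length 1.

    Component j is lower[j] - x[j], so x[j] + alpha * d[j] equals lower[j]
    at alpha = 1 for all j in the group at once; other components are zero.
    """
    d = [0.0] * len(x)
    for j in group:
        d[j] = lower[j] - x[j]
    return d


def take_step(x, d, alpha):
    """Move from x along d with step length alpha."""
    return [xj + alpha * dj for xj, dj in zip(x, d)]


# Hypothetical values: variables 0, 2 and 3 are close to their (zero) lower bounds.
x = [0.3, 5.0, 0.1, 0.05]
lower = [0.0, 0.0, 0.0, 0.0]
group = [0, 2, 3]

d = joint_bound_direction(x, lower, group)
half_way = take_step(x, d, 0.5)
at_bounds = take_step(x, d, 1.0)
```

At any step length below 1 none of the grouped variables is at its bound; at exactly 1 all of them are, which is the property described above.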

While  we  have  concentrated  on  means  of 
improving  the  solution  path  (iteration  count) 
above,  another  means  of  improving  solution  tech- 
niques for  staircase  models  is  to  speed  up  each 
step  of  the  simplex  algorithm  by  taking  advantage 
of  the  special  structure  of  the  basis  for  such 
problems. As early as 1954, G.B. Dantzig [6]
pointed  out  that  such  problems  exhibit  an  "almost" 
square  block-triangular  basis  structure  which 
could  be  decomposed  into  a  product  of  a  true 
square  block-triangular  matrix  and  another  matrix 
with  only  a  few  columns  differing  from  the  unit 
matrix.     The  advantage  of  this  procedure  is  that 
square  block-triangular  matrices  can  themselves 
be  very  efficiently  decomposed  to  give  a  very 
sparse  factorization  of  the  basis.    A  version  of 
this  method,  employing  modern  factorization  tech- 
niques, has  been  implemented  at  SOL  by  A.F.  Perold 
(a  graduate  student)  and  is  now  being  tested  on 
problems  of  significant  size.     Early  indications 
are  that  this  method  of  handling  the  basis  can  in- 
deed be  more  efficient  than  a  direct  treatment 
which  does  not  take  the  staircase  structure  of 
models into account. Perold's code is based on
SOL's LPM1 linear programming code, as was the
nested decomposition code by J.K. Ho and A.S. Manne
[12] for the same class of problems. This should
facilitate comparison between the specialized
simplex  and  decomposition  approaches  to  these 
models. 
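
To indicate why a true square block-triangular matrix is cheap to work with, the sketch below solves a block lower-bidiagonal (staircase) system by block forward substitution, so that only the small diagonal blocks are ever factorized. It is not the factorization of [6] nor Perold's code: the correction matrix for the "almost" square case is omitted, and all names and numbers are hypothetical.

```python
def solve_dense(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def solve_staircase(diag, sub, rhs):
    """Solve diag[t] x[t] + sub[t-1] x[t-1] = rhs[t] for t = 0, 1, ... in order."""
    xs = []
    for t, (d, b) in enumerate(zip(diag, rhs)):
        if t > 0:
            carry = matvec(sub[t - 1], xs[t - 1])
            b = [bi - ci for bi, ci in zip(b, carry)]
        xs.append(solve_dense(d, b))
    return xs


# Two periods with 2x2 diagonal blocks and one linking block.
diag = [[[2.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [0.0, 1.0]]]
sub = [[[1.0, 0.0], [0.0, 1.0]]]
rhs = [[2.0, 2.0], [8.0, 6.0]]
solution = solve_staircase(diag, sub, rhs)
```

The work grows with the number of periods times the cost of one diagonal-block solve, rather than with the cost of factorizing the whole matrix at once.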

An  alternative  to  all  of  the  above  numer- 
ical treatments  of  discrete  multi-time-period 
models  is  to  attempt  to  solve  the  underlying  con- 
tinuous time  problem,  which  can  be  thought  of  as  a 
linear  control  problem  with  state-space  constraints. 
G.B.  Dantzig  and  R.E.  Davis  are  investigating  a 
"continuous  simplex  method"  for  such  problems.  Al- 
though progress  has  been  made,  much  work  remains 
to be done.

This year the United States is two
centuries  old,   as  you  may  be  tired  of 
hearing  by  now.     Even  if  one  goes  back  to 
the  early  English  settlements,   this  land 
is  only  about  three  and  one-half  centuries 
old.     That  is  not  very  long,   really.  In 
terms  of  generations,   about  12  to  14.  The 
small  city  near  Vienna  where  I  have  been 
living  for  nearly  two  years  goes  back  to 
the  tenth  or  eleventh  century.     There  are 
many  active  monasteries  in  Austria  and 
some  have  been  in  continuous  operation 
since  the  eighth  or  ninth  century.     So  we 
are  a  young  nation  in  some  ways;   yet  our 
political  institutions  are  among  the 
oldest  in  their  present  form,   and  in  many 
areas  of  technology  and  business  we  have 
led  the  rest  of  the  world. 

The impact of the U.S. on the world
has been tremendous. One must live for
a while outside North America to begin
really to understand this. Other peoples
have watched us--not necessarily their
governments--much longer than we have
watched them, and still do to a greater
degree. We began to be a world power
probably about the time of the
Spanish-American War, just over three-quarters
of a century ago. After World War I, U.S.
business became a dominant force in many
areas and grew with the world. And it
has grown. I was born just before the
end of World War I and the population of
the world was then much less than two
billion, or less than 45% of what it is
today. Approximately the same ratio
holds for the U.S. and the effect is
noticeable.

During  and  after  World  War  II,  of 
course,   the  U.S.   became  the  mightiest 
nation  on  earth--in  almost  any  way  you 
want  to  measure  except  sheer  population 
figures.     Yet  it  was  five  years  after 
that  before  computing  as  we  know  it  to- 
day could  be  said  to  be  in  its  infancy. 
(Vuegraph  1.)     There  were,   of  course, 
antecedents  going  back  much  further,  but 
these  are  mainly  of  historical  interest. 
I  started  in  the  field  at  the  RAND 
Corporation  in  January,    1951.     It  was 
three  years  later  before  we  had  a  real 
computer and we were among the first. I
had lived almost half a lifetime before
getting into computing and still my career
has  spanned  virtually  its  entire  history. 
So  our  field  is  very  young  indeed. 

Nevertheless,   the  accomplishments  of 
the  computing  field  during  its  first 
quarter  century  are  extremely  impressive. 
Furthermore,   with  apologies  to  our  many 
foreign  friends  and  professional  associ- 
ates,  computing  is  almost  exclusively  an 
American  development.     There  are  three 
technologies  in  which  we  are  still  clearly 
supreme:     telephone  systems,  commercial 
aircraft,   and  computing.      (The  list  is  no 
doubt  longer  but,   I  fear,   getting  shorter 
as  time  goes  on.)     One  flies  almost  every- 
where in  American  aircraft,   and  foreign 
computer  manufacturers  have  all  but  given 
up — with  the  possible  exception  of  Japan 
and  the  new  imitative  line  which  the 
Eastern  bloc  is  attempting  to  create. 
Some  very  good  software  has  come  out  of 
France,   particularly  in  the  MP  field,   and  a 
little  from  England,   but  almost  always  in 
connection  with  American  manufacturers  or 
other  multi-national  corporations.  The 
bulk  of  system  software  comes  from  the  U.S. 
and  a  large  part  of  application  software. 
(Vuegraph 2, with extemporaneous discussion
of evolution of computing technology.)

I  think  no  one  in  the   1950s  really 
understood  the  amount  of  effort  required  to 
develop  systems,   application  packages, 
efficient  compilers,   and  so  on.  Almost 
nothing  worked  right  during  the  whole 
frustrating  decade  of  the   '60s.     In  retro- 
spect,  it  is  not  surprising  and  we  could 
all  be  forgiven,   I  think,    for  a  little 
self-praise  that  so  much  was  actually 
accomplished.     But  those  frustrations 
often  left  a  bad  taste,   a  deep  skepticism, 
a  rigidity  of  method  and,   in  some  cases, 
bad  feelings.     One  can  still  see  the 
effects  of  this. 

But  the   '60s  are  behind  us,   and  there 
is  no  point  in  dwelling  on  the  mistakes, 
failures  and  unrealistic  estimates.  The 
solid  accomplishments  far  outweigh  them. 
In  fact,   the  field  has  progressed  so 
rapidly  that  it  is  now  possible  for  one  or 
two  people  to  accomplish  in  a  couple  of 
weeks  what  would  have  been  considered, 
only ten years ago, a sizeable project. I
have completed several such tasks in the
past  several  months,   and  surprised  even 
myself.     But  this  doesn't  happen  always 
and  everywhere.     It  is  surprising  how 
much  computing  is  still  done  in  old- 
fashioned  ways    (if  we  can  use  that  term 
for  so  young  a  field)    and  how  slowly  newer 
techniques  spread.     At  one  time,   I  fought 
operating  systems--not  because  I  didn't 
understand the problems but, on the
contrary, because I understood them very well and
didn't  trust  anyone  else  to  solve  them  in 
a  way  satisfactory  as  a  base  for  my  work. 
Against  the  advice  of  knowledgeable 
colleagues,   I  long  ignored  telecommu- 
nications,  remote  processing  and  inter- 
active systems--except  for  a  couple  of 
especially  suitable  uses — for  the  same 
reason.     But  in  all  cases,   enough  people 
worked  long  enough  to  make  the  various 
systems  operate  in  a  satisfactory  and  even 
elegant  and  fruitful  manner. 

Of  course,   most  of  us  are  not  inter- 
ested in  the  whole  computing  field.  In 
fact,    it  is  now  so  extensive  that  it  is 
almost  impossible  to  comprehend  it.  How- 
ever,  the  math  programming  field  is  very 
rich  in  itself  and  its  growth  has  both 
paralleled  and  been  a  component  part  of 
the  whole  experience  of  the  computing 
field.     It  had  its  identifiable  beginnings 
at  the  same  time  as  computers,   that  is,  in 
about  1947-8.     That  was  when  Dantzig 
developed  the  simplex  method  in  connection 
with  planning  problems  for  the  U.S.  Air 
Force.     As  with  computers,   about  five 
years  passed  during  early  developments 
before  linear  programming  began  to 
blossom  into  a  practical  technique.  A 
quarter  of  a  century  has  passed  since 
then  and,    for  most  of  this  time,   LP  has 
pushed  the  capacity  of  current  computers 
to  the  limit.     LP  and  the  simplex  method 
are  still  the  foundation  stones  of  the 
whole field of math programming and the
vanguard  of  OR  techniques. 

I  first  started  working  with  George 
Dantzig  in  December,    1952.     I  got  the 
idea  somehow  that  I  was  supposed  to  build 
an  automated  simplex  method  and  that  the 
principal problems were the amount of
arithmetic,  maintaining  sufficient  pre- 
cision,  and  the  large  storage  required. 
We  started  out  modestly.     I  tried  for  25 
rows  on  the  old  Card-Programmed- 
Calculator  and  upped  this  to  40  when  we 
worked  out  the  product  form  of  inverse. 
A  year  later  we  were  trying  to  do  100 
rows  on  the  IBM  701   and,  when  I  solved 
Alan Manne's 101-row gasoline blending
problem,   I  thought  we  had  really  accom- 
plished something.     We  went  to  256  rows 
on  the  704,   later  512  and  then  1024  rows 
on  the  IBM  7090.     It  was  ten  years  after  I 
started  before  we  had  a  reasonably  auto- 
matic system  for  that  size  problem,  and 
then  not  always.     Furthermore,   the  data 
processing  and  service  routines  out- 
weighed the  algorithms.     I  didn't  know 
what  I  was  getting  into  in  1952. 

But then I suspect neither did George
Dantzig. The development of mathematical
programming in breadth, depth, prestige and
applications during the 1960s was fantastic.
We had our share of frustrations--
decomposition,    in  spite  of  much  initial 
interest,  extensive  software  efforts,   and  a 
later  revival,   has  still  not  been  per- 
fected nor  applied  on  a  large  scale;  matrix 
generation  techniques,   with  perhaps  even 
more  intensive  efforts,   still  have  no 
widely  accepted  theoretical  or  practical 
basis.     This  area  is  now  commanding  much  of 
my  own  attention.     However,   we  have  an 
international  Mathematical  Programming 
Society, which has its own Journal and
draws a bigger attendance at its symposia
than the whole computing community did in the
mid-50s. Professor Dantzig's work has
been recognized by a Presidential award.
All this and much more is well known to
you.

Perhaps  not  so  well  known  is  the 
institute  where  I  am  presently  working. 
It  is  called  the  International  Institute 
for  Applied  Systems  Analysis  or  IIASA. 
This  represents  a  further  extension  of  the 
field  of  operations  research  to  an  inter- 
national setting.     It  is  not  clear  what  we 
should  call  this  whole  field  of  which  we 
are  a  part.     It  is  not  easy  to  define 
systems analysis concisely; in fact, one
program  at  IIASA  is  to  produce  a  set  of 
definitive  publications  for  the  field. 
Perhaps  of  more  immediate  interest  is  the 
genesis  and  make-up  of  IIASA.     It  has  been 
in  formal  existence  only  since  late  1972 
but  this  was  preceded  by  several  years  of 
planning,   negotiations  and  agreements. 
Professor  Howard  Raiffa  from  Harvard  was 
its  first  director.     In  his  speech  at  the 
Council  meeting  in  November  1975,   he  said 
the  following:     "I  believed  in  the  con- 
cept of  IIASA  when  I  was  first  introduced 
to  the  idea  by  McGeorge  Bundy  back  in 
1968.    ...Scientific  detente  then  as  now  is 
critically  necessary  if  we  are  to  have  a 
sane  world."     To  what  extent  political 
detente  was  a  driving  force  in  the  for- 
mation of  IIASA  is  not  clear  to  me--my 
first  introduction  to  the  Institute  was  in 
October   1973  when  Dantzig  asked  me  to 
attend  one  of  its  organizational  con- 
ferences.    In  any  event,   it  truly  is 
international  in  character  with  particular 
emphasis  on  shared  sponsorship  and 
direction  by  both  East  and  West.     The  U.S. 
and  the  U.S.S.R.   each  contribute  equally 
and  in  by  far  the  largest  amounts.  There 
are   13  or   14  other  countries  who  each 
contribute  an  equal  but  much  smaller 
amount.     This  funding  is  not  done  directly 
as  government  grants  but  through  Academies 
of  Science  or  equivalent  organizations  in 
the  various  countries.     Austria  contri- 
butes in  a  special  way  through  providing 
facilities  at  very  low  or  even  token  rates, 
which  is  in  conformity  with  their  policy 
of  becoming  an  international  meeting 
ground.     The  Institute  is  housed  in  the 
refurbished summer palace--or Schloss, as
it is called--of Maria Theresa, at
Laxenburg about 10 miles south of Vienna.
(It  is  quite  an  experience  working  in  the 
midst  of  imperial  splendor.)     As  to  its 
mission,    I  can  do  no  better  than  to  again 
quote  Raiffa:      "I  believe  IIASA  has  a 
mission.     It  must  keep  its  doors  open,  so 
that  scientists  who  have  a  broad  vision  of 
tomorrow's  world  can  communicate  with  each 
other. We cannot afford to let these
doors  shut  tightly  because  of  some 
ephemeral,   trivial  problems.     We  must 
learn  today  how  we  can  jointly  grapple 
with  the  stream  of  equally  devastating 
problems  that  will  surely  shake  our 
societies  in  the  not  too  distant  future." 

Raiffa's remarkable performance in
getting  the  Institute  under  way  has  become 
almost  legendary.     He  was  replaced  a  year 
ago  by  Dr.   Roger  Levien  from  the  RAND 
Corporation.     The  chairman  remains 
Professor  Jermen  Gvishiani  from  the  USSR 
Academy  of  Sciences. 

For  the  past  few  months,    I  have  been 
working  with  the  Energy  Research  Program, 
headed by Professor Wolf Hafele from the
Federal  Republic  of  Germany  who  is  also 
deputy  director  of  IIASA.     Hafele  was  one 
of  West  Germany's  leading  experts  in 
nuclear  energy  and  is  well  known  in  the 
energy  field.     The  Energy  Program  is 
currently  the  largest  at  IIASA  with  about 
two  more  years  to  go  under  present  plans. 
But  there  is  not  time  for  more  details 
about IIASA's organization and operations.

There  are  many  aspects  to  the  Energy 
Program  but  one  is  the  formulation, 
implementation,   coordination  and  opera- 
tion of  a  set  of  models  with  emphasis  on 
strategies  for  a  transition  from  fossil 
to  nuclear  fuels.     At  least  that  is  the 
way  it  started  out.     Hafele  and  Alan 
Manne, with some assistance also from
Dantzig, produced an energy supply model
in 1974. It was a dynamic LP model and
there  have  been  several  variants  and  sub- 
sidiary models.     It  remains  the  center- 
piece of  the  modelling  system  but  many 
other  areas  have  had  to  be  investigated 
and  models  are  under  development  for  some 
of  them.     These  include  energy  resources, 
energy  demands,   long-range  economic 
forecasts    (the  last  two  being  among  the 
most  difficult)    and  secondary  investment 
requirements.     Hafele  asked  me  to 
coordinate--he calls it orchestrate--the
computerization  of  the  entire  modelling 
effort.     In  a  practical  sense,   this  turns 
out  to  involve  somewhat  more  than  merely 
computer  aspects.     When  one  is  trying  to 
coordinate  computations  with  a  set  of 
models  variously  formulated  by  Russians, 
Germans,  Americans,   Austrians,  Frenchmen 
and  others,   one  must  become  involved  with 
concepts,   definitions,  nomenclature, 
feasibility  of  approach  and  so  on.  One 
must  also  push  aside  with  some  diplomacy 
certain  formulations  and  levels  of  detail 
which  are  unsuitable  for  the  overall  task. 

Many  of  the  models  are  LP  models. 
The  Russians,   in  particular,   use  LP  to  an 
almost unbelievable extent in their
national planning and so it is understandable
that they take the same approach
to  international  models.     We  also  hope  to 
have  not  only  many  component  models  but 
versions  for  at  least  a  few  regions  of  the 
world  with  a  global  model  of  some  kind  on 
top.     Hafele  has  misgivings  about  the 
appropriateness  of  LP  for  some  of  these 
purposes,   and  so  do  I,   but  there  is 
clearly  much  LP  work  to  do  in  any  event. 

It  is  clear  that  we  will  have  to 
solve  some  models  many  times  on  an  inde- 
pendent basis  in  order  to  get  started.  It 
is  my  hope  that  we  may  learn  enough  this 
way  to  be  able  to  circumscribe  the  meaning- 
ful ranges  on  parameters  and  devise 
scenario  parts  which  can  be  combined  for 
various  cases.     We  don't  yet  know  how  to 
handle  the  entire  system  of  models  in  one 
piece, so why not cut it up and solve some
of the pieces a number of times to get
boundary conditions for the others? Maybe
you  call  this  suboptimization  but  it  is 
better  than  no  solution.     However,   for  this 
to  be  meaningful,   as  many  of  the  models  as 
possible  must  be  put  into  a  consistent 
framework  with  respect  to  style,  structure, 
nomenclature,   indexing  conventions  and  so 
on.

However,  my  main  difficulty  right  now 
is  that  I  don't  have  a  suitable  computer 
even  for  the  pieces.     I  have  the  software, 
and  there  is  much  other  useful  software 
available  to  us,   but  not  the  computers. 
Considering  the  nature  of  the  place  where 
I'm  working,   this  shouldn't  be  so.     I  am 
now  more  than  a  little  envious  of  all  the 
hardware  I've  seen  in  corporate  head- 
quarters,  investment  houses--yes,  and 
universities--which people are really just
playing  around  with.     I  don't  want  to  take 
it  away  from  them  because  there  are 
beneficial  side  effects--a  point  I'll  re- 
turn to  at  the  end.     But  there  is  some- 
thing badly  wrong  with  a  world  economic 
order  that  gives  its  best  tools  to  those 
who  really  don't  have  an  essential  need 
and  denies  them  where  they  are  needed  for 
studies  with  global  implications. 

At  the  Math  Programming  Symposium  in 
Budapest  last  August,   I  listened  to  a 
Hungarian  give  a  paper  on  a  state  planning 
project  he  had  been  involved  in.  They 
worked  over  a  year  and  went  through  all 
kinds  of  devious  methods  to  solve  their 
model.     Essentially,   it  was  a  big  GUB 
problem.     All  he  needed  was  a  few  hours 
with  MPS-III  on  a  360/65  or  better  and  the 
problem  would  have  been  solved.     I  spent 
some  hard  years  developing  GUB  in  MPS-III. 
Here  was  someone  who  really  needed  it  but 
couldn't  get  at  it  for  political  and 
economic  reasons.     It  strikes  me  that  there 
is  also  something  wrong  with  developing 
tools  and  then  not  making  them  available 
when  they  are  needed.     The  money  is  spent 
anyway.     If  the  Hungarians  do  a  better  job 
of  planning  with  our  tools,   how  can  this 
possibly  hurt  us?     Oh,    I'm  fully  aware  of 
the  differences  between  East  and  West  and 
the difficulties of normal economic
relationships. Somehow this doesn't seem
very important anymore.

Perhaps  you  think  I'm  either  off-base 
or  just  naive.     I  think  not.     As  short  as 
the  history  of  our  field  is,   it  will  be  a 
shorter  time  still  before  enormous  human 
calamities  begin  to  occur.     I  don't  mean 
big  wars--God  forbid  the  world  becomes 
embroiled  in  any  more  of  those,  although 
I  could  point  out  a  couple  of  disadvan- 
tages we  now  face  if  it  did  happen.  But 
calamities  are  already  happening.  Tens 
or  hundreds  of  thousands  of  people  have 
already  starved  in  Bangladesh  and  middle 
Africa,   in  just  one  season.     Some  people 
now  think  India  is  a  hopeless  case. 
Closer  to  home,   the  United  Kingdom  is  a 
catastrophe. Within living memory, the
British  Empire  was  the  greatest  power  on 
earth.     It  appears  that  England  may  soon 
be  reduced  to  a  poor  country.     Even  some 
Frenchmen  feel  bad  about  it.     Here  at 
home,   we  have  had  one  oil  crisis — you 
took  it  seriously  if  you  were  living  in 
Boston  at  the  time.     There  will  be  more. 
It  appears  likely  that  OPEC  oil  prices 
will  go  to  $15.00  per  barrel  this  month. 
That  is  not  yet  a  catastrophe  for  the 
U.S.   but  prices  will  go  higher  and  higher 
unless  some  drastic  changes  are  made  in 
our economic policies--that is, Project
Independence and much more must become a
reality.

Eduard  Teller  visited  IIASA  last 
month  and  gave  a  most  wonderful  talk  on 
alternate  sources  of  energy.     He  said  he 
didn't  know  whether  the  really  serious 
problems  of  the  world  could  be  solved 
but  he  was  sure  of  one  thing — they 
couldn't  be  solved  without  plenty  of 
energy  available.     Furthermore,  much  of  it 
has  to  be  in  a  form  which  is  usable  in 
small  doses  in  many  places.     For  example, 
it  would  do  no  good  to  put  a  thousand 
megawatt  reactor  in  an  undeveloped 
country.     They  simply  don't  have  the 
wires  to  carry  the  electricity  away  or  to 
distribute  it  to  all  the  places  it  is 
needed.     The  only  form  we  presently  have 
which  is  usable  this  way  is  petroleum. 
But  if  the  U.S.   keeps  gobbling  up  4555 
of  the  world's  production,   the  price  will 
keep  going  up  putting  it  out  of  reach  of 
many  nations  who  can  barely  pay  for  what 
they  use  now.     Furthermore,   we  only 
hasten  the  day  and  postpone  the  prepara- 
tion for  it  when  we  will  begin  to  run 
short  ourselves.     And  unless  we  drasti- 
cally change  our  style  of  living,  that 
will  be  a  catastrophe  here. 

I  bring  these  things  to  your  atten- 
tion to  indicate  the  kind  of  problems 
the  world  must  be  about  solving--and 
immediately.     I  will  add  to  Teller's 
remark  and  say  the  really  serious  problems 
can't  be  solved  without  effective  use  of 
computers  on  a  wide  scale.     Not  for 
keeping  accounts,   or  printing  electric 
bills  and  bank  statements,   or  even  for 
simulating    the  aerodynamic  qualities  of  a 
new airfoil, but for systems analysis or
whatever you wish to call it, seriously
and  expertly  applied.     That  may  not  be 
sufficient  but  it  is  necessary.  Further- 
more,  the  studies,   planning  and  decisions 
must  be  international  in  scope  and  that 
means across East-West boundaries, among
others. Otherwise, either they will be
ineffective or there will be counterplays
which come to the same thing. And the
world has run out of time for playing such
games.

I  was  talking  to  a  senior  French 
analyst  the  day  after  Teller's  talk  and  we 
got  to  philosophizing  a  bit.     I  said  I 
thought  there  would  be  such  dislocations  in 
the  world  during  the  next  quarter  century 
that  no  one  could  predict  how  the  global 
economic  order  would  change.     He  agreed 
and  even  gave  some  examples  from  history. 
I  then  added  that  I  thought  a  catastrophe 
in  one  area  would  be  bound  to  have  an 
effect  on  the  rest  of  the  world.     Here  he 
shook  his  head.     No,   he  said,    if  a  million 
people  died  in  India  or  Africa,  some 
people  in  Paris  would  send  a  little  money 
to  the  Post  Office  for  a  relief  fund  but 
otherwise  life  would  go  on  just  the  same. 
No  one  would  really  care.     Did  the  French 
care  when  Spain  was  living  in  poverty? 
No, it was a fine place for cheap
vacations.

Still,   I  must  believe  that  the  cumu- 
lative effect  of  a  series  of  calamities 
will  be  widespread,   the  more  so  as  they 
begin  to  be  chronic  rather  than  excep- 
tional.    The  decline  of  British  power,  for 
example,    is  probably  a  tragedy  of  greater 
consequence  than  starvation  in  some  un- 
developed country.     Why?     Because,  whether 
the  British  were  liked  or  not,   they  nearly 
always  left  things  in  a  better  state  than 
they  found  them.     If  they  can  not  now 
manage  their  own  affairs,   the  world  has 
lost  one  force  for  improvement  and  gained 
still  another  problem.     At  the  least,  it 
should  be  a  sobering  lesson  to  us. 

I  think  all  of  us,   even  including 
knowledgeable  scientists  and  high  officials 
of  state,   have  a  tendency  to  disbelieve 
what  seems  monstrous  and  new.     We  think  of 
such  things  abstractly  unless  and  until 
they  come  down  to  affecting  our  daily 
lives.     We  might  do  well  to  remember  Marie 
Antoinette  and  the  Romanovs.     Optimism  and 
courage  are  great  virtues  but  they  lie 
very close to foolhardiness. It is
commendable to be astute and shrewd in
business but this should be guided by
vision.

The  basic  outline  of  approaching 
world  problems  was  recognized  twenty  years 
ago.     The  late  J.D.  Williams  gave  some 
very  easily  understood,   dramatized  papers, 
unfortunately  probably  heard  or  read  by 
very  few  people.     I  attended  a  series  of 
seminars  at  the  American  Management 
Association  in  1961   and  one  lecture  was  a 
partly  humorous  but  highly  convincing 
dramatization  of  the  alarming  positive 
rates  of  change  in  a  variety  of  areas. 
(I regret that I have forgotten the
speaker's name; he was from Washington.)
He  presented  this  regularly  to  a  variety 
of  business  and  government  gatherings. 
Perhaps  he  made  it  too  easy  to  laugh. 

History  tells  us  over  and  over  that 
no  arrangement  is  permanent.     It  is 
foolish  to  believe  that  because  we  are 
strong  we  can  continue  to  consume  far  more 
than  our  proportionate  share  of  the  world's 
goods.     If  it  were  only  a  matter  of  others 
attaining  the  degree  of  affluence  we  enjoy, 
then  it  might  be  considered  reasonable  for 
the  technological  leader  to  have  the  most. 
This  has  been  pretty  much  our  attitude. 
Western  Europe  has  in  fact  gone  a  long  way 
toward  catching  up  though  their  per  capita 
consumption  of  energy,   for  example,  is 
slightly  less  than  half  what  ours  is.  In 
other  areas  there  is  a  discrepancy  of  an 
order  of  magnitude  and,    in  some  cases, 
almost  two  orders.     Some  of  these  areas 
must  increase  energy  consumption  just  in 
order  to  provide  food.     But  the  ready 
supply  of  easily  distributed  energy  forms 
is  becoming  limited.     Our  own  strength 
and  prosperity  depend  on  highly  developed 
uses  of  these  same  forms.     We  need  not 
moralize  but  only  consider  being  placed 
in  competition  with  people  who  must  have 
some  of  what  we  want  simply  to  survive. 
In  only  another  25  years,   the  population 
of  the  world  will  have  increased  another 
50%,   that  is,   by  more  people  than  were 
living  when  I  was  born.     If  any  substan- 
tial number  of  these  people  improve  their 
living  standard  by  even  10%  to  20%-- 
which  is  almost  nothing--the  strain  on 
world  capacities  will  be  fantastic. 

Let  me  put  it  another  way.  There 
have  been  extensive,   serious  and  competent 
studies  of  world  problems  for  a  number  of 
years.     Most  of  these  are  based  on  1967, 
occasionally  1970,   figures.     They  talk 
about  5  year,    10  year  or  25  year  pro- 
jections, with  a  few  attempts  at  long- 
term  forecasts  for  50  years  or  longer. 
But  it  is  this  minute  very  close  to  1977. 
More  than  five  years  have  passed  since 
1970,   nearly  ten  since  1967,   and  the 
leaves  of  the  calendar  keep  flipping.  It 
will  soon  be  1980  and  then  1985.  Already 
world  population  will  be  at  about  5 
billion.     When  one  measures  this  against 
the  time  for  building  an  industrial  com- 
plex,  improving  a  transportation  system, 
resolving  political  issues,  achieving 
international  accord  for  even  modest 
projects--well, it is clear that many
babies  born  this  very  day  are  doomed 
already.     There  is  no  such  thing  as 
opportunity  to  improve  their  lot  or  free- 
dom to  choose  their  career. 

So  what  can  we  do?     Over  the  past  two 
decades,   we  have  developed  very  powerful 
tools  for  analysis  and  planning.     A  great 
deal  of  hardware  and  software  now  works 
and  works  well.     Modelling  techniques, 
though  far  from  perfect,   are  well  advanced. 
We  are  now  in  a  position  technically  to 
make  real  inroads  on  world  problems,  to 
undertake the kind of work which
originally motivated the development of
large  computers,   and  planning  and  ana- 
lytical techniques.     But  we  have  not  yet 
learned  to  coordinate  their  use  consis- 
tently on  a  broad  scale,   nor  to  convince 
the  actual  decision-makers  that  they  can 
place  considerable  reliance  on  results. 

We  must  find  ways  to  make  the  use  of 
our  marvellous  tools  more  effective. 
First  of  all,   this  means  use  of  inter- 
active systems  on  both  large  and  small 
computers,   tied  together  with  networks  in 
a  consistent  manner.     All  these  things 
exist,   piecemeal,   and  are  in  daily, 
reliable  use.     We  have  the  technology  but 
we  lack  coordination  and  singleness  of 
purpose.     In  the  West,   this  is  due  to 
business  competition — including  the 
universities  and  research  centers  which 
are  nothing  more  or  less  than  large 
businesses.     In  Europe,   the  discontinu- 
ities between  traditional  nations  and 
peoples  add  further  dichotomies.  Between 
the  free  market  countries  and  the 
planned-economy  bloc,     fundamental  policy 
differences  exist.     Comparing  all  these 
more  or  less  developed  countries  with 
underdeveloped  countries,   one  finds  that 
the  latter  don't  even  understand  the  game. 
They  know  something  of  the  basic  tools  but 
very  little  about  motives,   incentives,  and 
other  driving  forces.     Of  course,  there 
are  individual  exceptions. 

Using computers with common conventions
for analytical work can be a very strong
unifying  force.     No  doubt  much  of  the  work 
of  this  kind  has  been  of  small  value  but 
if  it  brings  diverse  peoples  together  a 
little,   this  is  in  itself  an  accomplish- 
ment.    Moreover,   some  of  the  efforts  are 
certainly  first  rate  and  fruitful.  The 
more  widely  they  can  be  understood  and 
appreciated,   the  more  quickly  we  can  make 
effective  use  of  them  on  a  scale  appro- 
priate to  the  global  problems  we  are  all 
facing.     In  short,   our  discipline  needs 
more  standardization. 

Of  course,   standards  mean  different 
things  in  different  contexts.     Some  are 
essentially  a  legalization  of  what  is 
already  considered  good  practice,   such  as 
state  commissions  and  examining  boards 
over  a  wide  range  from  barbers  to  lawyers 
and  doctors.     For  more  technically  com- 
plicated areas  or  when  massive  or  tedious 
data  must  be  gathered,  we  have  such  in- 
stitutions as  the  National  Bureau  of 
Standards  and  the  American  Standards 
Association.     But  with  apologies  to  our 
hosts,   these  approaches  do  not  apply  to 
the  present  discussion. 

De  facto  standards — usually  due  to 
economic  forces--are  often  among  the  most 
effective.     Several  years  ago  there  was 
much  discussion  of  electronic  modes  of 
recording  on  magnetic  tape.     Many  people 
claimed  IBM's  methods  were  not  very  good. 
I  don't  know  whether  they  were  or  not  but 
they  worked  and,   to  stay  in  business, 
other  manufacturers  had  to  be  compatible 
with them. Now one can put a tape in his
briefcase, travel over a large part of
the  world  and  have  the  tape  read  by  a 
computer  in  some  remote  place.     This  is  a 
great  advantage.      I  think  no  one  would 
claim  today  that  the  MPS/360  input  formats 
are  very  good,  but  for  the  same  reasons, 
they  are  an  effective  standard.     As  a 
result,   problems  can  be  shipped  around  and 
solved  on  various  computers.  Furthermore, 
it  only  takes  one  word  to  describe  the 
format,   another  great  advantage. 

Unfortunately,  economic  forces  can 
also  work  the  other  way.     One  sometimes 
suspects  that  economists,  operations 
researchers,  and  other  system  analysts-- 
not  to  speak  of  software  designers--are 
just  as  happy  if  their  techniques  cannot 
be  measured  and  compared  too  clearly  and 
precisely,   another  result  of  intense 
competition.     But  competition  won't  solve 
today's  problems;   collaboration  is  much 
more  to  the  purpose. 

In  the  task  I'm  presently  engaged  in, 
it  is  necessary  to  devise  some  sort  of 
standards  and,   one  way  or  another,   to  en- 
force their  use.     To  do  this,   they  must  be 
logically  consistent  and  explainable-- 
which  is  perhaps  the  best  kind  of  stand- 
ardization.    I  have  discovered  that  this 
is  not  an  easy  task  but,  more  importantly, 
the  attempt  leads  to  considerable  clarifi- 
cation of  fundamental  ideas.     Let  me 
illustrate this with the problem of
identifiers.

As  most  or  all  of  you  know,   an  LP 
model  has  got  to  have  unique  row  and 
column  identifiers  and,   in  a  practical 
sense,   these  are  limited  to  8  characters 
with  only  36  to  38  graphics  available. 
Since  some  formulations  require  five  or 
six  indices,   one  must  be  extremely  frugal 
with  encoding  schemes.     This  forces  one 
to  study  very  carefully  just  what  it  is 
that  one  is  trying  to  represent  and  this, 
in  turn,   leads  to  some  rather  deep  con- 
siderations.    One  runs  into  such  problems 
as  what  constitute  substances,   what  can 
be  called  processes,    is  there  a  funda- 
mental distinction  between  primary  and 
secondary  conversions,  what  are  capac- 
ities and  how  are  they  related  to  other 
variables?     What  is  capital  and  what  is 
labor?     These  last  two  gave  me  consider- 
able trouble  but  finally  I  thought  I  had 
them  correctly  classified.     I  spent  most 
of  a  Sunday  reading  the  encyclopaedia  to 
see  if  my  conclusions  were  correct  and 
found  that  essentially  they  were.  I 
learned  much  more  which  I  would  never  have 
comprehended  if  I  hadn't  been  trying  to 
figure  out  how  to  automate  matrix  and  re- 
port generation  and  to  make  them  con- 
sistent over  a  range  of  models.  Some 
economists  would  do  well  to  go  through  the 
same  drill  to  clarify  their  own  thinking. 
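
The frugality forced by 8-character identifiers can be made concrete with a small sketch. It assumes, purely for illustration, a 36-graphic alphabet (digits plus upper-case letters) and a hypothetical layout of six index fields; the field widths and the function names `encode_identifier` and `decode_identifier` are assumptions and do not come from any MPS system.

```python
import string

# Assumed alphabet: 10 digits + 26 letters = 36 graphics.
ALPHABET = string.digits + string.ascii_uppercase
NAME_LENGTH = 8  # practical limit on row and column identifiers


def encode_identifier(indices, widths):
    """Pack integer indices into fixed-width base-36 fields."""
    if len(indices) != len(widths):
        raise ValueError("one width per index is required")
    if sum(widths) > NAME_LENGTH:
        raise ValueError("fields do not fit in 8 characters")
    fields = []
    for value, width in zip(indices, widths):
        if not 0 <= value < len(ALPHABET) ** width:
            raise ValueError(f"index {value} does not fit in {width} character(s)")
        digits = []
        for _ in range(width):
            value, remainder = divmod(value, len(ALPHABET))
            digits.append(ALPHABET[remainder])
        fields.append("".join(reversed(digits)))
    return "".join(fields).ljust(NAME_LENGTH)  # pad with blanks on the right


def decode_identifier(name, widths):
    """Recover the indices from a name built by encode_identifier."""
    indices, position = [], 0
    for width in widths:
        value = 0
        for character in name[position:position + width]:
            value = value * len(ALPHABET) + ALPHABET.index(character)
        indices.append(value)
        position += width
    return indices


# Hypothetical layout: two 2-character fields and four 1-character fields.
WIDTHS = (2, 2, 1, 1, 1, 1)
name = encode_identifier([40, 7, 3, 12, 0, 35], WIDTHS)
```

With this layout a two-character field distinguishes 1,296 values but a one-character field only 36, which is why six indices leave so little room and why the meaning of each index has to be settled before the encoding is fixed.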

Once  the  manipulation  of  the  models 
is  conveniently  automated,   we  can  then 
begin  paying  attention  to  the  really  im- 
portant questions--the  validity  of  data 
and  its  meaning,   the  effect  of  variations, 
i.e. sensitivity analysis, the effect of
changes in hypotheses and how this relates
to  model  structure  and  its  projection  in 
reality.     It  must  be  possible  to  sit  down 
at  a  console  and,  during  a  morning,  get 
solutions  to  several  variants  of  a  model, 
with  human  interaction  with  the  computer 
quickly  and  cryptically  communicated.  This 
means  interactive  systems  and,    in  most 
situations--certainly  ours — it  means 
reliable  telecommunication  facilities. 
Only  then  can  we  begin  to  get  some  feel 
for  the  possible  behavior  of  our  compli- 
cated world. 

Thus  the  computer,   used  as  an  analy- 
tical tool--in  fact,   almost  as  a 
colleague--will  be  not  only  a  computational 
engine  but  a  unifying  and  standardizing 
force.     There  will  still  be  plenty  of 
differences  of  opinions  but  the  issues  will 
be  clearer  and  facts--as  nearly  as  we  can 
approximate  them--will  stand  out. 

But  all  this  will  happen  only  if  those 
of  us  who  are  skilled  in  and  motivated  by 
the  effective  use  of  computers  begin  to 
assume  leadership.     The  world  is  buried  in 
scientific  and  technical  journals  and  there 
is  no  end  to  ever-multiplying  complexity 
of  thought  and  confusion  of  detail.     It  is 
almost  a  sickness.     The  need  now  is  to  take 
hold  of  all  our  skills,   tools,  and  proven 
techniques  and  mobilize  them  to  the  best 
of  our  ability  in  order  to  clarify  issues, 
influence  meaningful  decisions,  promote 
rational  cooperation,   and  continually 
sharpen  our  perceptions.     We  have  nothing 
to  lose  and  we  might  just  discover  a 
world order that is workable.


IMPLEMENTATION AND APPLICATION
OF A NESTED DECOMPOSITION ALGORITHM*

James K. Ho

Applied Mathematics Department

ABSTRACT

This  paper  considers  a  nested  decomposition  algorithm  for  multi-stage  linear 
programs  with  the  staircase  structure.     Computational  aspects  of  the  algorithm  essential 
to  an  efficient  implementation  are  discussed.     Experience  with  using  experimental  codes 
of  the  algorithm  on  problems  arising  from  real  applications,    such  as  energy  models, 
engineering  design,   and  dynamic  traffic  control  is  presented.     It  is  observed  that  nested 
decomposition  can  be  an  efficient  technique  for  large-scale  systems. 


*Work  performed  under  the  auspices  of  the  ERDA. 


1. The Staircase Algorithm

We consider the linear programming
problem of minimizing

[Equation (1) is missing from the source.]

In [3] and [7], the Staircase algorithm
is developed from an application of
nested decomposition to (1). Using this
algorithm, the original problem is
replaced by a sequence of smaller,
independent subproblems coordinated by
primal (proposals) and dual (prices)
information in the sense of Dantzig and
Wolfe [2].

A cycle of the Staircase algorithm,
indexed by k, consists of T subproblems
denoted by $SP_t^k$, t = 1, ..., T, and
defined

[Equations (2)-(4) are missing from the source.]

The various terms in the subproblems
are defined recursively as follows,
preceded by their dimensions. A prime
denotes a transpose.

scalar: $\delta_t^k = 1$ if the solution generating
the proposal is feasible, $0$ if it is homogeneous.

[The remaining definitions, through (8), are
illegible in the source.]

$\pi_t^k$ is the vector of dual variables for (3).
$\sigma_t^k$ is the dual variable for (4).

$SP_t^k$ is read as subproblem t at cycle k.
$SP_t$ denotes subproblem t at an unspecified
cycle. The $(1 + m_t) \times 1$ vector $(p_t^k, q_t^k)$
is called a proposal from $SP_{t-1}$ to $SP_t$.
The prices for $SP_t$ are given by
$(\pi_{t+1}^k, \sigma_{t+1}^k)$.

The three phases of the Staircase
algorithm are summarized below.

Phase 1:

Step (i):   Set t = 1.

Step (ii):  Start with an artificial basis for $SP_t$.
            Set the objective to be the sum of the
            infeasibilities in $SP_t$. Use the Phase 2
            procedure to solve the subsystem
            [$SP_1$, ..., $SP_t$]. If the optimal value of
            the objective > 0, stop: problem is
            infeasible. Otherwise, go to Step (iii)
            if t < T; go to Phase 2 if t = T.

Step (iii): Generate a proposal for $SP_{t+1}$.
            Set t = t + 1. Go to Step (ii).

Phase 2:

Step (i):   Set k = 1.

Step (ii):  Set t = T.

Step (iii): Solve $SP_t^k$. If t < T, send a proposal
            (if any) to $SP_{t+1}^{k+1}$. If t > 1 and
            subproblem is bounded, send prices to
            $SP_{t-1}^k$, otherwise go to Step (ii); if
            t = T and subproblem is unbounded, stop:
            problem is unbounded. If t = 1, go to
            Step (v).

Step (iv):  Set t = t - 1. Return to Step (iii).

Step (v):   Test for optimality:
            $z^k = \sum_{t=1}^{T} \pi_t^k d_t$. If optimal,
            go to Phase 3. Otherwise, set k = k + 1;
            go to Step (ii).

Phase 3¹:

Step (i):   Set t = T, $y_T = x_T^k$; compute
            $d_T - A_T y_T$.

Step (ii):  Set t = t - 1. Solve $SY_t$. If t = 1,
            stop. Otherwise compute $d_t - A_t y_t$.
            Return to Step (ii).

A flow diagram of the algorithm is
given in Figure 1.


¹ Define for t = T-1, ..., 1,

$$SY_t: \quad \text{minimize } c_t y_t + p_t w_t$$
$$\text{subject to } A_t y_t + Q_t w_t = d_t$$
$$B_t y_t = d_{t+1} - A_{t+1} y_{t+1}$$
$$\delta_t w_t = 1$$

For $SY_1$ delete terms involving $w_1$.


2. Implementation

The computational efficiency of an
implementation of the Staircase algorithm
depends essentially on the following three
aspects:

(i)   data structure: This prescribes the
      amount of data required to define a
      subproblem, and the amount of data
      transfer required to update a
      subproblem.

(ii)  solving a subproblem: This is how the
      subproblems are to be solved as
      linear programs.

(iii) coordinating information: This is how
      prices are incorporated and how
      proposals are generated in a
      subproblem.

For our experimental codes, we adopted
the guideline of making full use of
advanced simplex techniques in linear
programming for solving the subproblems
(see e.g. [11], [14]). Efficient designs
for (i) and (iii) compatible with such a
scheme are then identified. Although we
have been limited by the scope of this
research to give certain considerations to
convenience of programming, the underlying
concepts are nonetheless general and
should be applicable to even the most
sophisticated implementation.


3. Data Structure

Large scale problems are usually very
sparse. The density of nonzero entries in
the constraint matrix is typically less
than one percent [1]. Although the nonzero
entries concentrate in blocks for staircase
structures, the subproblems defined on
these blocks should still be sparse. For
example, a 0.1% density in a 10-period
1,000 x 10,000 (row by column) problem
implies an average density of 0.526% for
the nonzero blocks. Therefore, the
subproblem data should be stored in packed
form. Many schemes are available for
storing only the nonzero elements (see
e.g. [8], [12]). Since our algorithm
consists mainly of column operations, we
use a column packing scheme. The nonzeros
are packed in a vector (one-dimensional
array), by column order. A second vector
of the same length contains the row
indices of the corresponding entries in
the first vector. A third vector gives for
each column the position of its first
entry in the other two vectors. In the
Staircase algorithm, the subproblems are
modified by the addition of proposals,
which form new columns in the constraint
matrices. With the above scheme, appending
a new column is done by simple extension
of the three vectors.

In general, the data for each
subproblem can be classified as follows:

(i)   constraint type data,
(ii)  right-hand-side data,
(iii) basis data,
(iv)  constraint matrix data,
(v)   proposal data.

The first two classes remain unchanged
from one cycle to the next during Phase 1
and Phase 2. They are modified in Phase 3.
The amount of basis data that needs to be
stored depends on the subproblem solution
procedure. In our case, these are simply
index vectors identifying a basis. Class
(iv) consists of data for the current
constraint matrix, while (v) contains data
for the new columns to be added to form a
new subproblem in the following cycle.
The necessity to differentiate between
(iv) and (v) depends on the relative
efficiency in concatenating data records,
which is in turn determined by the data
storage device and mode of data transfer
being used.
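
The density figure and the three-vector column packing scheme described in this section can be sketched as follows. This is a minimal illustration, not the experimental code: the class name `PackedColumns`, its attribute names, and the assumption of equal-sized periods in `staircase_block_density` are all introduced here for the example.

```python
def staircase_block_density(overall_density, periods):
    """Average density inside the nonzero blocks of a staircase matrix.

    Assumes equal-sized periods, so the matrix is a periods x periods grid
    of blocks, of which `periods` diagonal and `periods - 1` off-diagonal
    blocks can hold nonzeros.
    """
    nonzero_blocks = 2 * periods - 1
    return overall_density * periods * periods / nonzero_blocks


class PackedColumns:
    """Column-packed sparse matrix held in three parallel vectors."""

    def __init__(self):
        self.values = []      # nonzeros, column by column
        self.row_index = []   # row of each entry in `values`
        self.col_start = []   # position of each column's first entry

    def append_column(self, entries):
        """Append a column given as (row, value) pairs; zeros are skipped."""
        self.col_start.append(len(self.values))
        for row, value in sorted(entries):
            if value != 0:
                self.row_index.append(row)
                self.values.append(value)

    def column(self, j):
        """Return column j as a list of (row, value) pairs."""
        start = self.col_start[j]
        if j + 1 < len(self.col_start):
            end = self.col_start[j + 1]
        else:
            end = len(self.values)
        return list(zip(self.row_index[start:end], self.values[start:end]))


block_density = staircase_block_density(0.001, 10)   # about 0.00526

matrix = PackedColumns()
matrix.append_column([(0, 2.0), (3, -1.0)])
matrix.append_column([(1, 4.0)])
# A new proposal arrives: the three vectors are simply extended.
matrix.append_column([(0, 1.0), (2, 0.0), (3, 5.0)])
```

For the 10-period example, 19 of the 100 blocks hold all the nonzeros, which gives 0.1% x 100/19, or about 0.526%. Appending a column never moves existing entries, which is the property the text relies on when proposals are added.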


Subproblem  data  are  stored  out  of 
core.     Regions  in  out-of-core  memory  are 
designated  for  the  various  classes  of  data. 
Each region is subdivided into T
subregions, corresponding to the number of
periods in the problem. To solve a subprob-
lem,   data  from  the  appropriate  subregions 
are  read  into  the  work  region  (scratch 
space)    in  core.     Outputs  from  the  subprob- 
lem are  written  into  the  appropriate  sub- 
regions  as  data  for  later  cycles.  The 
data  flow  for  a  subproblem  is  illustrated 
in  Figure  2. 
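
The region and subregion arrangement can be imitated with ordinary files. The sketch below is only an analogy under stated assumptions: one file per (data class, period) stands in for a subregion, a dictionary stands in for the in-core work region, and the names `OutOfCoreStore`, `DATA_CLASSES` and `load_work_region` are hypothetical.

```python
import os
import pickle
import tempfile

# The five data classes of this section; the names are illustrative.
DATA_CLASSES = ("constraint_type", "rhs", "basis", "matrix", "proposals")


class OutOfCoreStore:
    """One file per (data class, period) stands in for a subregion."""

    def __init__(self, directory, periods):
        self.directory = directory
        self.periods = periods

    def _path(self, data_class, t):
        if data_class not in DATA_CLASSES or not 1 <= t <= self.periods:
            raise KeyError((data_class, t))
        return os.path.join(self.directory, f"{data_class}_{t}.bin")

    def write(self, data_class, t, record):
        with open(self._path(data_class, t), "wb") as handle:
            pickle.dump(record, handle)

    def read(self, data_class, t):
        with open(self._path(data_class, t), "rb") as handle:
            return pickle.load(handle)

    def load_work_region(self, t):
        """Read every class of data for subproblem t into an in-core dict."""
        return {name: self.read(name, t) for name in DATA_CLASSES}


with tempfile.TemporaryDirectory() as scratch:
    store = OutOfCoreStore(scratch, periods=3)
    for t in range(1, 4):
        for name in DATA_CLASSES:
            store.write(name, t, [])
    store.write("rhs", 2, [5.0, 7.0])

    work = store.load_work_region(2)         # in-core copy for period 2
    work["basis"] = [0, 3]                   # pretend a basis was found
    store.write("basis", 2, work["basis"])   # output kept for later cycles
    reloaded = store.read("basis", 2)
```

Only the data for one subproblem is in core at a time; everything else stays in its subregion until the algorithm returns to that period.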

4.  Solving  a  Subproblem 

To  solve  the  subproblems  efficiently 
as  linear  programs,   we  use  a  product-form- 
inverse   (PFI)   revised  simplex  routine  with 
an  inversion  subroutine  designed  to  pro- 
duce sparse  representations  of  the  basis 
inverse. This routine is based on LPM1,
an  in-core  LP  code  written  by  J.  Tomlin  in 
1970  [14]. 
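
For readers unfamiliar with the product form of the inverse, the idea can be sketched in a few lines. This is a generic textbook version, not LPM1 and not the routine used in the experimental codes: the eta columns are stored densely, pivots are taken in natural order, and no attempt is made to preserve sparsity. All function names are assumptions.

```python
from fractions import Fraction


def apply_eta(eta, vector):
    """Multiply an elementary (eta) matrix into a vector."""
    pivot_row, column = eta
    pivot_value = vector[pivot_row]
    result = [v + column[i] * pivot_value for i, v in enumerate(vector)]
    result[pivot_row] = column[pivot_row] * pivot_value
    return result


def ftran(eta_file, vector):
    """Forward transformation: apply the etas in the order of creation."""
    for eta in eta_file:
        vector = apply_eta(eta, vector)
    return vector


def make_eta(pivot_row, transformed_column):
    """Eta column that maps `transformed_column` onto a unit vector."""
    pivot = transformed_column[pivot_row]
    if pivot == 0:
        raise ZeroDivisionError("zero pivot; a real code would reorder pivots")
    column = [-entry / pivot for entry in transformed_column]
    column[pivot_row] = 1 / pivot
    return pivot_row, column


def invert_in_product_form(basis_columns):
    """Represent the basis inverse as a list (file) of eta vectors."""
    eta_file = []
    for row, column in enumerate(basis_columns):
        transformed = ftran(eta_file, column)
        eta_file.append(make_eta(row, transformed))
    return eta_file


# Basis with columns (2, 1)' and (1, 3)'; exact arithmetic for clarity.
basis = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]
eta_file = invert_in_product_form(basis)
solution = ftran(eta_file, [Fraction(3), Fraction(5)])
```

Here `solution` solves the 2 x 2 system with right-hand side (3, 5) without ever forming the inverse explicitly; the inverse exists only as the sequence of eta vectors.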

5. Coordinating Information

The  typical  subproblem  can  be  arranged 
in  the  form  shown  in  Figure  3  where  **  de- 
notes a  non-binding  row.     No  pivot  opera- 
tion is  to  be  performed  on  such  a  row. 
Apart  from  being  consistent  with  the  data 
structure  described  in  Section  3,   this  full 
subproblem  formulation  has   the  advantage 
of  allowing   (a)   dual  prices  to  be  incor- 
porated into  the  objective  function  with- 
out any  change  in  original  data  and  (b) 
primal  proposals  to  be  inferred  from  the 
updated  right-hand-sides  of  the  non-binding 
rows.     To  show  this,  we  use  the  simplifying 
notation  in  Table  1. 

Now, suppose we maintain a basis for
the full subproblem including the non-
binding rows. Such a basis will have the
following form:

$$\hat{M} = \begin{bmatrix} 1 & \hat{c} & 0 \\ 0 & \hat{A} & 0 \\ 0 & \hat{B} & I \end{bmatrix} \tag{9}$$

where the entry 1 and the identity sub-
matrix I indicate that the slacks of the
non-binding constraints are always basic
since no pivoting is done on such rows.
The basis inverse is then

$$\hat{M}^{-1} = \begin{bmatrix} 1 & -\hat{c}\hat{A}^{-1} & 0 \\ 0 & \hat{A}^{-1} & 0 \\ 0 & -\hat{B}\hat{A}^{-1} & I \end{bmatrix} \tag{10}$$

The objective function for $SP_t$, as given
by (2), is to be minimized (the superscript
k is dropped for convenience). In the
present notation,

$$z_t = (c - \pi_{t+1} B)\,x. \tag{11}$$

We argue that it is not desirable to set
up $z_t$ explicitly. First, a direct
multiplication of $\pi_{t+1}$ and B would make
it necessary to distinguish between data
in A and B. This is cumbersome since the
matrix data are packed column by column.
Secondly, to update the objective, we need
to store the original c separately since
this is used every time we compute a new
$z_t$. Finally, updating requires the
accommodation of new nonzeros. This cannot
be done efficiently in the data structure
being used. All these complications can
be circumvented as follows. Recall that
without the $\pi_{t+1} B$ term in the objective,
the simplex multiplier $\pi$ for the basis
$\hat{M}$ would satisfy

$$\pi \hat{M} = (1,\ \underbrace{0}_{m_t\ \text{components}},\ 0).$$

This gives a reduced cost of 1 to the
slack variable in the objective row, which
is always basic since the objective row is
nonbinding, and zero reduced costs to the
other basic variables.

Suppose we now require $\pi$ to satisfy

$$\pi \hat{M} = (1,\ 0,\ \underbrace{-\pi_{t+1}}_{m_{t+1}\ \text{components}}) \tag{12}$$

so that²

$$\pi = (1,\ 0,\ -\pi_{t+1})\,\hat{M}^{-1} = (1,\ 0,\ -\pi_{t+1}) \begin{bmatrix} 1 & -\hat{c}\hat{A}^{-1} & 0 \\ 0 & \hat{A}^{-1} & 0 \\ 0 & -\hat{B}\hat{A}^{-1} & I \end{bmatrix} = (1,\ -(\hat{c} - \pi_{t+1}\hat{B})\hat{A}^{-1},\ -\pi_{t+1}). \tag{13}$$

Letting $\pi_t = (\hat{c} - \pi_{t+1}\hat{B})\hat{A}^{-1}$, we have
$\pi = (1,\ -\pi_t,\ -\pi_{t+1})$ and the reduced costs
are

$$\bar{c} = \pi M = c - (\hat{c} - \pi_{t+1}\hat{B})\hat{A}^{-1} A - \pi_{t+1} B = (c - \pi_{t+1} B) - \pi_t A \tag{14}$$

so that we are effectively using $z_t$.

Note that (12) implies that the re-
duced costs of the slack variables in the
last $m_{t+1}$ rows are precisely $-\pi_{t+1}$. These
variables are always basic since the cor-
responding rows are nonbinding. Therefore,
we can interpret (12) as the setting of
prices on the inputs and outputs described
by the matrix B.
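
The identities (9), (10), (13) and (14) can be checked numerically on a toy case. The sketch below uses made-up integer data with two binding rows and one non-binding row; the helper functions and all the numbers are assumptions chosen only so that the arithmetic is exact.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def hstack(*blocks):
    """Place blocks with equal row counts side by side."""
    return [sum((list(block[i]) for block in blocks), [])
            for i in range(len(blocks[0]))]


def vstack(*blocks):
    """Stack blocks with equal column counts on top of each other."""
    return [list(row) for block in blocks for row in block]


def zeros(rows, cols):
    return [[0] * cols for _ in range(rows)]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def negate(X):
    return [[-value for value in row] for row in X]


def subtract(X, Y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]


# Made-up data: m_t = 2 binding rows, m_{t+1} = 1 non-binding row.
A_hat = [[2, 1], [1, 1]]
A_inv = [[1, -1], [-1, 2]]      # inverse of A_hat (determinant 1)
c_hat = [[3, 2]]
B_hat = [[1, 4]]
pi_next = [[5]]                 # prices pi_{t+1}

M_hat = vstack(hstack([[1]], c_hat, zeros(1, 1)),
               hstack(zeros(2, 1), A_hat, zeros(2, 1)),
               hstack(zeros(1, 1), B_hat, identity(1)))
M_inv = vstack(hstack([[1]], negate(matmul(c_hat, A_inv)), zeros(1, 1)),
               hstack(zeros(2, 1), A_inv, zeros(2, 1)),
               hstack(zeros(1, 1), negate(matmul(B_hat, A_inv)), identity(1)))
assert matmul(M_hat, M_inv) == identity(4)        # (10) inverts (9)

# (12)-(13): multipliers obtained from the modified right-hand side.
pi = matmul(hstack([[1]], zeros(1, 2), negate(pi_next)), M_inv)
pi_t = matmul(subtract(c_hat, matmul(pi_next, B_hat)), A_inv)
assert pi == hstack([[1]], negate(pi_t), negate(pi_next))

# (14): reduced cost of a column N = (c^s, A^s, B^s)', computed two ways.
c_s, A_s, B_s = [[4]], [[1], [2]], [[3]]
N = vstack(c_s, A_s, B_s)
reduced_direct = matmul(pi, N)[0][0]
reduced_formula = ((c_s[0][0] - matmul(pi_next, B_s)[0][0])
                   - matmul(pi_t, A_s)[0][0])
```

The point of the exercise is that the prices enter only through the vector on the right of (12); the stored data for c, A and B are never altered.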

Next, consider the updated right-hand-
side given by (15), where $\hat{x}$ gives the
values of the basic variables in the
current extreme point solution x of
$SP_t^k$. The nonbasic variables in x are,
of course, at zero value.

$$\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix} = \hat{M}^{-1} d = \begin{bmatrix} -\hat{c}\hat{x} \\ \hat{x} \\ -\hat{B}\hat{x} \end{bmatrix} \tag{15}$$

Therefore, according to (5), (6), and (15),
a proposal corresponding to x is given by

$$\begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} cx \\ Bx \end{bmatrix} = \begin{bmatrix} \hat{c}\hat{x} \\ \hat{B}\hat{x} \end{bmatrix} \tag{16}$$

which, apart from the sign, is part of the
updated right-hand-side $\beta$.


² This is the backward transformation or
BTRAN step in the revised simplex method.

Proposals corresponding to extreme ray solutions of $SP_t^k$ can be generated as follows. Suppose a column in $M$, say

$$N = \begin{bmatrix} c^s \\ A^s \\ B^s \end{bmatrix},$$

is priced out to be the pivot column. Its transformation* with respect to the basis is

$$\bar N = \hat M^{-1} N = \begin{bmatrix} 1 & -\hat c \hat A^{-1} & 0 \\ 0 & \hat A^{-1} & 0 \\ 0 & -\hat B \hat A^{-1} & I \end{bmatrix} \begin{bmatrix} c^s \\ A^s \\ B^s \end{bmatrix} = \begin{bmatrix} c^s - \hat c \hat A^{-1} A^s \\ \hat A^{-1} A^s \\ B^s - \hat B \hat A^{-1} A^s \end{bmatrix}. \qquad (17)$$

If $\bar A^s = \hat A^{-1} A^s \le 0$, then $SP_t^k$ is unbounded from below, with

$$x_h = (0, \ldots, 0, 1, 0, \ldots, 0, (-\bar A^s)_i, 0, \ldots, 0) \qquad (18)$$

(the 1 in the $s$-th position and $(-\bar A^s)_i$ in the $j$-th position, where $x_j$ is basic in the $i$-th row) as a homogeneous solution on the extreme ray (cf. Theorem 2.1, p. 35 in [13]). By definition, the extreme ray is simply $x_h/(e \cdot x_h)$ where $e = (1, \ldots, 1)$. Observe that by (5), (6), (17) and (18), the proposal corresponding to $x_h$ is

$$\begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} c^s - \hat c \bar A^s \\ B^s - \hat B \bar A^s \end{bmatrix} = \begin{bmatrix} c^s - \hat c \hat A^{-1} A^s \\ B^s - \hat B \hat A^{-1} A^s \end{bmatrix} \qquad (19)$$

which is part of the transformed column $\bar N$. The proposal corresponding to the extreme ray can, of course, be obtained from (19) by scaling with $1/(e \cdot x_h)$. However, this is not necessary and we may simply send the proposal in (19) to $SP_{t+1}^{k+1}$. This is equivalent to an implicit scaling of the corresponding $\lambda$ variable in $SP_{t+1}^{k+1}$ by $e \cdot x_h$, hence there is no loss of generality.

Finally, to determine whether a proposal is profitable to $SP_{t+1}^{k+1}$ according to the prices $\pi_{t+1}$, we have to test whether†

$$z_t - \sigma_{t+1} < 0 \quad \text{for extreme point proposals,}$$
$$z_t < 0 \quad \text{for extreme ray proposals.}$$

This second case is always satisfied since $SP_t^k$ is unbounded from below. Hence, no computation is required. For the first case we compute

* This is the forward transformation or FTRAN step in the revised simplex method.

† In the Staircase algorithm, "proposal" and "profitable proposal" are used synonymously, so that strictly speaking, the vector defined in (16) is a proposal only if it passes the test.
AA


t+1 


=  -L9. 


(20) 


which  is  essentially  the  evaluation  of  an 
inner  product. 
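The two ways of forming a proposal, (16) for an extreme point and (19) for an extreme ray, can be sketched as follows. This is a minimal illustration with made-up data; the function names are hypothetical, and the vector `A_bar_s` is assumed to have been obtained already by the forward transformation.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def point_proposal(c, B, x):
    """(p, q) = (c x, B x) for a solution vector x, as in (16)."""
    return dot(c, x), [dot(row, x) for row in B]

def ray_proposal(c_s, B_s, c_hat, B_hat, A_bar_s):
    """(p, q) = (c^s - c_hat A_bar^s, B^s - B_hat A_bar^s), as in (19)."""
    p = c_s - dot(c_hat, A_bar_s)
    q = [b_s - dot(row, A_bar_s) for b_s, row in zip(B_s, B_hat)]
    return p, q

# Hypothetical data: columns 0 and 1 are basic, column 2 is the pivot column s.
c = [2, 3, 1]
B = [[1, 0, 4],
     [2, 1, 0]]
A_bar_s = [-1, -2]          # transformed pivot column, all entries <= 0

# Homogeneous solution (18): 1 in position s, -A_bar_s in the basic positions.
x_h = [1, 2, 1]

from_x_h = point_proposal(c, B, x_h)
from_column = ray_proposal(1, [4, 0], [2, 3], [[1, 0], [2, 1]], A_bar_s)
```

Both routes give the same vector, which is why the ray proposal can be read directly off the transformed column without forming $x_h$.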

We  have  shown  that  as  the  result  of 
applying  the  revised  simplex  procedure  to 
the  full  subproblem 


minimize 


cx 


subject  to     L   Jx  ^  d 


X  s  0 


(21) 


(a) the dual information $\pi_{t+1}$ is efficiently utilized as part of the prices $\pi$ for (21);

(b) the primal information $[p; q]$ is efficiently generated, being a by-product of right-hand-side updating in the case of extreme points, and of forward transformation of a pivot column in the case of extreme rays.


6. Refinements

Within the basic framework summarized in Section 1, many refinements of the Staircase algorithm are possible. Such modifications are called computational strategies. Some are designed to accelerate convergence; others to reduce storage requirements and the amount of data transmission. Most of them are motivated by heuristics and have to be validated empirically. Moreover, good strategies for one class of problems may turn out to be poor ones for another. Therefore, it is important to parametrize such refinements so that a process of fine tuning is possible for any given class of problems. We identify three important strategies here. For more detail, the reader is referred to [3].

(a) Degree of Decomposition:

For a given amount of core storage, it is often feasible to vary the number of subproblems by grouping two or more as a single stage. A higher degree of decomposition gives smaller subproblems which are easier to solve, but is likely to require more interactive adjustments before obtaining an optimal coordination among the subproblems. Initial experience with the Staircase algorithm [3] suggests that whenever nested decomposition algorithms are intended for routine applications on a particular class of problems with a fixed amount of core storage, one should predetermine empirically a good strategy for the degree of decomposition.

(b) Multi-proposal Generation:

By generating (when possible) more than one proposal from each subproblem, convergence may be accelerated. However, if too many proposals are transmitted in a cycle, the inferior ones, which may never become useful, simply cause an unnecessary increase in the size of a subproblem. Moreover, proposals that are too similar may give rise to numerical instability. Therefore, a heuristic procedure is required to select a limited number of proposals. A limit of five provided good results in most cases we encountered.
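One possible selection heuristic of this kind is sketched below. It is not the procedure used in the experimental codes, which the text does not spell out: the ranking by reduced cost, the cosine similarity test and the threshold `max_cosine` are all assumptions chosen for illustration. Only the default limit of five comes from the text.

```python
from math import sqrt

def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def select_proposals(candidates, limit=5, max_cosine=0.999):
    """Pick at most `limit` profitable proposals, skipping near-duplicates.

    candidates: list of (reduced_cost, vector); a more negative reduced
    cost is taken to mean a more profitable proposal.
    """
    chosen = []
    for rc, vec in sorted(candidates, key=lambda item: item[0]):
        if rc >= 0:
            break                       # remaining candidates are not profitable
        if any(cosine(vec, kept) > max_cosine for _, kept in chosen):
            continue                    # too similar to one already selected
        chosen.append((rc, vec))
        if len(chosen) == limit:
            break
    return chosen

picked = select_proposals([(-3, [1, 0]), (-2.5, [2, 0]), (-1, [0, 1]), (0.5, [1, 1])])
```

In this example the second candidate is dropped because it is parallel to the first, and the last one because it is not profitable.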

(c) Proposal Purging:

As proposals are introduced, the growing size of a subproblem may cause difficulties for in-core storage. As far as the optimization of $SP_t$ is concerned, the only proposals that need be kept are the currently basic ones (for feasibility) and the latest profitable ones (for improvement). All others could be dropped as they will be generated again if and when they become profitable on later cycles. We use a scheme to purge as many non-basic proposals as necessary to keep the subproblem size within limits determined by core availability. However, a modification of Phase 3 is also required to allow for proposal purging.
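A purging rule along these lines might look as follows. The sketch assumes that proposals are identified by ids held in arrival order and that the oldest unprotected ones are dropped first; that ordering, the function name and the `capacity` parameter are illustrative assumptions, not details taken from SC73 or SC74.

```python
def purge_proposals(proposals, basic, latest, capacity):
    """Drop unprotected proposals, oldest first, until at most `capacity` remain.

    proposals: ids in order of arrival (oldest first)
    basic:     ids currently in the basis (kept for feasibility)
    latest:    ids from the most recent cycle (kept for improvement)
    """
    protected = set(basic) | set(latest)
    kept = list(proposals)
    for pid in proposals:
        if len(kept) <= capacity:
            break
        if pid not in protected:
            kept.remove(pid)
    return kept

remaining = purge_proposals([1, 2, 3, 4, 5, 6], basic={2, 4}, latest={6}, capacity=4)
```

If the protected proposals alone exceed the capacity, this sketch keeps them all; a real code would have to handle that case explicitly.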


7. The Experimental Codes

Two experimental codes, named SC73 and SC74, have been written in FORTRAN. Input data is in standard MPS format plus a section on information characterizing a pattern of decomposition. Except for the data handling features, the two codes are identical.

SC73 runs on an IBM 360/91 at the Stanford Linear Accelerator Center. It requires approximately 200K bytes of core storage when dimensioned for problems with a maximum of 20 periods, each having up to 500 rows (as a full subproblem) and 3000 nonzero coefficients (including proposals). Secondary storage is on 220 tracks (one track = 7294 bytes) of IBM type 2314 magnetic disk. Data transmission is by direct access I/O using variable length records with a block size of 7294 bytes.
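For scale, the secondary storage quoted for SC73 can be worked out directly from the track count and track size given above. The constant names below are illustrative.

```python
TRACKS = 220          # tracks of 2314 disk used by SC73
TRACK_BYTES = 7294    # bytes per track, also the record block size

disk_bytes = TRACKS * TRACK_BYTES   # total secondary storage in bytes
```

That is roughly 1.6 million bytes of disk behind about 200K bytes of core.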

SC74 runs on the CDC 6000 and 7000 series computers. It requires approximately 35K words of SCM core storage when dimensioned for problems with a maximum of 10 periods, each having up to 500 rows (as a full subproblem) and 6000 nonzero coefficients (including proposals). Secondary storage requires 170K words of ECS (extended core storage) or LCM (large core memory). Block transfer of data between SCM and LCM is used.

For comparison with a direct simplex approach, we used MPS/360 and LPM1 [14], the latter being the simplex procedure used in SC73 and SC74. Roughly the same storage configuration is used in each comparative run.

8. Experience with Applications

This section presents computational experience with the successful application of the Staircase algorithm to three classes of multi-stage linear programs. In each case, we describe the nature of the problem briefly and summarize the performance of the Staircase algorithm as compared to a direct simplex approach.

(a) Optimal Design of Multi-stage Structures [4]:

The problem is to design multi-stage planar trusses for minimal weight over a class of feasible member sizes and configurations. The design is based on limit analysis subject to a single set of loads. The variables $x_t$ represent forces in members of stage $t$ in the truss. The equilibrium conditions for stage $t$ are expressed through $A_t$ while $B_t$ represents the coupling between stage $t$ and $t+1$. The external loads are given in $d_t$. Typically, these problems have a large number of columns in proportion to the number of constraints. Therefore, the efficiency of the Staircase algorithm could be due partly to a partial-pricing effect. See Table 2.


(b) Dynamic Energy Model [5]:

These test problems are derived from a staircase version of Manne's model of U.S. options for a transition from oil and gas to synthetic fuels [9]. They seek minimum-cost strategies to meet future energy demands under a series of alternative scenarios. The latter depend on estimates of the remaining quantities of domestic oil and gas resources, and the technical and environmental feasibility of new methods for synthetic fuel production. The variables are production capacities and investment in the energy sectors. The single-period constraints exert bounds on new capacity introduction rates and relate production to final demands in energy output. The dynamic constraints relate capacity inventories and investment, model the nuclear cycle, and exert bounds on cumulative resource extraction.

The model has 16 periods representing five-year intervals from 1970 to 2045. However, a four-period decomposition is used for reasons explained in Section 6(a). All new technologies are allowed in problem 4A (see Table 3) while most of them are suppressed in problem 2A. The performance of the direct simplex approach (LPM1) reflects this variation in complexity. With decomposition, by contrast, such effects are "felt" by each subproblem from the start. This may explain why SC74 took roughly the same amount of time for all four problems.

(c) Dynamic Traffic Assignment [6]:

In the Merchant and Nemhauser model of dynamic traffic assignment [10], a traffic network is represented by a directed graph. One of the nodes is designated as the destination. The planning horizon is divided into a finite number of discrete time periods. For each time period, external inputs are allowed at any node except the destination. For each arc, there is an exit function which relates the amount of traffic entering and leaving the arc during a time period. Congestion is modeled by assuming the exit functions to be nondecreasing, continuous, piecewise linear and concave. The problem is to find the feasible traffic flow which minimizes the total amount of traffic over the planning horizon.

Here the unknowns are the amount of traffic in each arc in each time period. They are transformed to convex combinations of the grid points for the piecewise linear exit functions. Therefore, the variables $x_t$ are the interpolation weights for these combinations. The single-period constraints are the flow balance equations for the nodes. The dynamic constraints are the flow balance equations for the arcs.
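To make the role of the interpolation weights concrete, the sketch below writes a traffic volume as a convex combination of adjacent grid points and evaluates a piecewise linear exit function from those weights. In the linear program the weights are decision variables; here they are computed directly, purely for illustration. The grid, the exit values and the function names are hypothetical.

```python
def interpolation_weights(grid, x):
    """Weights w >= 0 with sum(w) == 1 and sum(w[i] * grid[i]) == x.

    At most two adjacent weights are nonzero. `grid` must be increasing.
    """
    if not grid[0] <= x <= grid[-1]:
        raise ValueError("x lies outside the grid")
    w = [0.0] * len(grid)
    for i in range(len(grid) - 1):
        lo, hi = grid[i], grid[i + 1]
        if lo <= x <= hi:
            frac = (x - lo) / (hi - lo)
            w[i], w[i + 1] = 1 - frac, frac
            return w

def exit_flow(grid, values, x):
    """Piecewise linear exit function evaluated through the weights."""
    w = interpolation_weights(grid, x)
    return sum(wi * vi for wi, vi in zip(w, values))

grid = [0, 10, 20]        # traffic volumes at the grid points (assumed)
values = [0, 8, 12]       # exit volumes: nondecreasing and concave (assumed)
weights = interpolation_weights(grid, 15)
flow = exit_flow(grid, values, 15)
```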

The  Staircase  algorithm  is  used  as 
part  of  a  hybrid  algorithm  [6]  for  this 
class  of  problems.     See  Table  4. 


Remarks

It has been observed in [3] that, relative to a direct simplex approach, the Staircase algorithm tends to become more efficient with increasing problem size. However, the threshold problem size differs considerably for different classes of staircase problems. The results presented here, though favoring decomposition in every case, simply substantiate that observation. They should not be interpreted as measures of the relative performance of the two approaches in terms of absolute problem size.
Table 2. Statistics of the Structural Design Problems

Fig.    3.     A  Full  Subproblem 
ABSTRACT

This paper presents a cutting plane method for integer programming. Let $a_0$ be the optimal point of the associated linear program and $n$ be the normal to the objective function hyperplane at $a_0$. On the halflines of the cone incident at $a_0$, we determine those points where the halflines intersect the coordinate planes and then project them onto $n$. With the projections as stepping-stones, the objective function hyperplane (the parallel cut) is pushed into the cone step by step. At each step, a minimum number of integer points is generated for feasibility and optimality tests. The first feasible integer point 'trapped' by the parallel cut is the optimal point of the integer program.

Other main features of this method are: (1) In practice, it is not necessary to compute the cut. (2) In general, the candidates generated at each step give better value to the objective function than the candidates generated at the next step. (3) Reoptimization is not required.

Keywords: Integer programming, cutting plane, cone, normal, projections.


1.  INTRODUCTION

This paper presents a cutting plane method for integer programming. The cut is always parallel to the hyperplane of the objective function (called a parallel cut). Let $a_0$ be the optimal point of the associated linear program and $n$ be the normal to the objective function hyperplane at $a_0$. On the halflines of the cone incident at $a_0$, we determine those points where the halflines intersect the coordinate planes. The intersection points are projected onto $n$. These projections are then used as stepping-stones for moving the parallel cut into the cone step by step. At each step, the intersections on the halflines are used to generate a set of integer points for feasibility and optimality tests. The process terminates as soon as a feasible integer point is 'trapped' by the parallel cut.

Our parallel-cut method is closely related to three other approaches developed in the last few years, namely the methods of convexity cuts [1,2,3,8,11,12], enumerative cuts [4,5,6], and cut-search [7,8,9]. However, these approaches have some or all of the following disadvantages:

1.  There exist no criteria for the choice of convex regions (in the case of convexity cuts and enumerative cuts) or for the choice of halflines (in the case of cut-search).

2.  Extra computational work is required to generate the cuts. The amount of work depends mainly on the convex region used.

3.  A cut may be either very shallow or in the wrong 'inclination', i.e. too many bad candidates may be generated.

4.  Reoptimization is required.

5.  Usually, even after a feasible integer point has been 'trapped' by a cut, the search for a better solution still has to be continued.

In contrast, our parallel-cut algorithm has the following main features:

1.  No choice of convex regions or edges is required. A parallel cut is generated step by step in a definite manner.

2.  A parallel cut is merely a conceptual cut. In practice, no computation is involved in its generation.

3.  The direction of a parallel cut is steepest and hence may be regarded as the proper 'inclination'. Together with stepsize control, a parallel cut generates, at each step, a minimum number of good candidates.

4.  Reoptimization is not required.

5.  Once a feasible integer point is 'trapped' by a cut, it is optimal and the process terminates.

Parallel cuts are also used, conceptually, in Hillier's bound-and-scan algorithm [10], but in a different manner.


2.    GEOMETRIC INTERPRETATION OF THE PARALLEL-CUT METHOD

In this section, we illustrate by an example the geometrical motivation of our parallel-cut method. For expository purposes, we shall use the following definitions.

Definition (O-plane, O(x*)-plane)  An O-plane is a hyperplane of the form $cx = \alpha$, where $cx$ is the objective function and $\alpha$ is a constant. In particular, an O(x*)-plane has the form $cx = cx^*$, where $x^*$ is a fixed point on the plane.

In Figure 1, $C_1$ and $C_2$ are two halflines incident at $a_0$ and $n$ is the (negative) normal to the $O(a_0)$-plane at $a_0$. We observe that the coordinate planes through an integer point inside the cone intersect at least one of the halflines. For example, the coordinate planes through one such integer point in Figure 1 intersect $C_1$ and $C_2$ at one point each. Hence, one of the coordinates of each of these two intersection points must be integral.


Definition (H-point)  An H-point is a point on a halfline incident at $a_0$ such that at least one of its coordinates has integral value.

Starting from $a_0$, let us move the O-plane in the direction of $n$. As it moves forward, it will 'trap' infinitely many integer points, some lying outside the cone (e.g. $I_1$), some lying inside the cone but violating certain nonbinding constraints. The first feasible integer point 'trapped' (here $I_3$) is obviously the optimal point. Note that if these integer points are projected perpendicularly onto $n$, the image of the optimal integer point has the shortest distance from $a_0$ among all feasible integer points.


Using the projections $p_0, p_1, p_2, p_3, p_4$ and $p_5$ as stepping-stones, Table 1 shows the H-points, integer coordinates and integer points newly generated at each step. Note that, at $p_5$, $I_3$ becomes optimal, because it is the first feasible integer point 'trapped'.


Pi 


H-points 


integer 
coordinates 


integer  points 


Po 

P2 
P3 


none 

Hi 

Hi 


Hi 


Xj  =  2 


X2=l 


x  j  =  3 


none 
none 
none 


I2   (infeasible) , 
1 2    (curr.  optimal) 
I^   (rejected,  worse 


than  I 


Hi 


Table  1 

Information  newly  generated  at  each  step  in  Fig.  1. 


(e.g. 


The     first    feasible  integer  point 


Figure  1    Illustration  of  the  parallel-cut  method. 


3.    PROBLEM FORMULATION AND MATHEMATICAL PRELIMINARIES

Consider the pure integer linear program

$$Q:\quad \begin{cases} \text{maximize} & cx \\ \text{subject to} & Ax \le b \\ & x_i \ge 0,\ \text{integer},\ i \in N, \end{cases} \qquad (3.1)$$

where $N = \{1, 2, \ldots, n\}$, $c$, $x$ are $n$-vectors, $A$ is an $m \times n$ matrix and $b$ is an $m$-vector. Throughout this paper, we refer to $x_i$, $i \in N$, as the structural variables.

The associated linear program $Q'$ is obtained from the integer linear program $Q$ by dropping the integrality requirement. Introducing $m$ slack variables $s = \mathrm{col}(s_1, s_2, \ldots, s_m)$, $Q'$ can be written in the standard form

$$Q'':\quad \begin{cases} \text{maximize} & x_0 = cx \\ \text{subject to} & Ax + s = b \\ & x_i \ge 0,\ i \in N,\quad s_j \ge 0,\ j \in M, \end{cases} \qquad (3.2)$$

where $M = \{1, 2, \ldots, m\}$.

Suppose $Q''$ has an optimal solution. Let $B$ and $J$ be the index sets of the basic structural and basic slack variables respectively and $y_{ij}$ be the coefficients of the optimal simplex tableau. Then the objective function value and each of the basic structural variables can be expressed in terms of the nonbasic (structural and slack) variables $t_j$ as

$$x_0 = y_{00} - \sum_{j \in N} y_{0j} t_j,$$
$$x_i = y_{i0} - \sum_{j \in N} y_{ij} t_j, \quad i \in B. \qquad (3.3)$$

Let us replace every $y_{ij}$ in (3.3) by another notation $a_{ij}$, for $i \in B$ and $j \in \{0\} \cup N$. Attaching, for each nonbasic structural variable $x_i$, the trivial relation setting $x_i$ equal to itself, i.e. a relation of the form (3.3) in which $a_{i0} = 0$, $a_{ii} = -1$ and $a_{ij} = 0$ for $i \ne j$, $i \in N - B$, and rewriting in vector notation, we obtain the following linear program over a cone

$$C:\quad \begin{cases} \text{maximize} & x_0 = y_{00} - \sum_{j \in N} y_{0j} t_j \\ \text{subject to} & x = a_0 - \sum_{j \in N} a_j t_j \ge 0,\quad t_j \ge 0 \\ & x_i\ \text{integer},\ i \in N. \end{cases} \qquad (3.4)$$

The fundamental relation between the problems $C$ and $Q$ is as follows: Let $t^*$ be an optimal solution to $C$ and $x^* = a_0 - \sum_{j \in N} a_j t_j^*$. Then $x^*$ is an optimal solution to $Q$ if and only if $x^*$ also satisfies those constraints of $Q$ which are not binding at $a_0$.

The $j$-th halfline of the cone (3.4) is defined by

$$C_j:\quad x = a_0 - a_j t_j, \quad t_j \ge 0. \qquad (3.5)$$

The quantities $a_0$, $a_j$ and $ca_j$, $j \in N$, will be used in Section 4. The next lemma shows that they can be obtained from the optimal simplex tableau of the linear program $Q''$.

Lemma 1  Suppose the following $m \times (n+1)$ matrix

$$\begin{bmatrix} Y_1 & Y_2 \\ Y_3 & Y_4 \end{bmatrix}$$

(first block row: basic structural variables; second block row: basic slack variables) is obtained from the optimal simplex tableau of $Q''$, where $Y_1$ and $Y_3$ correspond to the columns of the current optimal point and the nonbasic slack variables, whereas $Y_2$ and $Y_4$ correspond to the columns of the nonbasic structural variables. Then the column vectors $a_j$ of (3.4) are the columns of the following $n \times (n+1)$ matrix

$$\begin{bmatrix} Y_1 & Y_2 \\ 0 & -I \end{bmatrix}$$

(first block row: basic structural variables; second block row: nonbasic structural variables), where $I$ is an identity matrix.

Furthermore, suppose $y_{0j}$, $j \in N$, are the relative-cost factors obtained from the optimal simplex tableau. Then

$$ca_j = -y_{0j}, \quad j \in N. \qquad (3.6)$$

Proof. The first part of the lemma follows from the definition of $a_j$. To derive (3.6), we see that, for every $t_j$, $j \in N$,

$$y_{0j} = \delta_j - \sum_{i \in J} 0 \cdot y_{ij} - \sum_{i \in B} c_i y_{ij} = -\sum_{i \in \bar B} c_i a_{ij} - \sum_{i \in B} c_i a_{ij} = -ca_j,$$

where $\delta_j$ is equal to the cost associated with $t_j$ if $t_j$ is a structural variable, and is equal to 0 if $t_j$ is a slack variable, and $\bar B = N - B$. (Q.E.D.)


4.    THE  PARALLEL-CUT  ALGORITHM 

In this section, we develop the parallel-cut algorithm first under the assumption that the optimal solution of the associated linear program $Q'$ is unique and non-degenerate. Non-uniqueness and degeneracy will be discussed in Section 4.5.

The parallel-cut algorithm includes the following operations:

1.  Generate H-points on the halflines.

2.  Project the H-points onto the normal $n$.

3.  Using  the  projections  as  stepping-stones, 
generate  a  set  of  candidates  at  each  step  for 
feasibility  and  optimality  tests. 

The  details  of  these  operations  are  described 
in  the  next  four  subsections. 

4.1  GENERATING H-POINTS ON THE HALFLINES

The use of H-points was first developed by Glover [7,8]. However, he used them for generating cuts directly, whereas we use them for generating projections and candidates. Let us first restate without proof one of his lemmas, using our notation.

Lemma 2 (First cut-search lemma of Glover)

Assume $x'$ is contained in the truncated cone of points satisfying both (3.4) and

$$\sum_{j \in N} (1/t_j^*)\, t_j \le 1, \qquad (4.1)$$

where $t_j^* > 0$ for all $j \in N$. Then every hyperplane $L(x - x') = 0$ through $x'$ (for $L$ a non-zero row vector) intersects at least one of the edges of the truncated cone incident at $a_0$ (i.e. the line segments $x = a_0 - a_j t_j$, $t_j^* \ge t_j \ge 0$).

In particular, let $x'$ be an integer point inside the cone (3.4), $t_j^* = \infty$ for every $j \in N$, and $L = (0, 0, \ldots, 0, 1, \ldots, 0)$, where 1 is in the $i$-th component, $i = 1, 2, \ldots, n$. Then, Lemma 2 implies that every coordinate plane through an integer point in the cone intersects at least one of its halflines. The H-points generated on the $j$-th halfline are

$$\xi_j^\mu = a_0 - a_j t_j^\mu, \qquad (4.2)$$

where the increasing sequence of parametric values of $t_j$ is defined as follows:

$t_j^1 = \min\{t_j \mid t_j > 0$ and $a_{i0} - a_{ij} t_j$ is a nonnegative integer for some $i\}$,

$t_j^{\mu+1} = \min\{t_j \mid t_j > t_j^\mu$ and $a_{i0} - a_{ij} t_j$ is a nonnegative integer for some $i\}$.

If there is no value of $t_j$ that can be generated in this way, then $t_j^1$ or $t_j^{\mu+1}$ is defined to be equal to $\infty$. Numerically, the values of $t_j^\mu$ satisfying the above definition may be obtained as follows:

$$t_j^0 = 0, \quad t_j^{\mu+1} = t_j^\mu + \min_{i \in N}\{\psi_i\}, \quad \mu = 0, 1, 2, \ldots \qquad (4.3)$$

where

$$\psi_i = \begin{cases} \big((a_{i0} - a_{ij} t_j^\mu) - \langle a_{i0} - a_{ij} t_j^\mu \rangle\big)/a_{ij} & \text{if } a_{ij} < 0, \\ \big((a_{i0} - a_{ij} t_j^\mu) - \{a_{i0} - a_{ij} t_j^\mu\}\big)/a_{ij} & \text{if } a_{ij} > 0, \\ \infty & \text{otherwise,} \end{cases}$$

$\langle z \rangle$ denotes the smallest integer greater than $z$, and $\{z\}$ denotes the largest integer smaller than $z$.

4.2  PROJECTING THE H-POINTS ONTO THE NORMAL

Let $n$ be the (negative) normal to the $O(a_0)$-plane at $a_0$. The distance of any point $x$ from the $O(a_0)$-plane is given by

$$d(x) \equiv c(a_0 - x)/\|c\|, \qquad (4.4)$$

where $\|\cdot\|$ denotes the Euclidean norm. In particular, the distance between $a_0$ and the projection of an H-point (4.2) on $n$ is given by (see Lemma 1)

$$d(\xi_j^\mu) = (ca_j/\|c\|)\, t_j^\mu = (-y_{0j}/\|c\|)\, t_j^\mu, \quad j \in N. \qquad (4.5)$$

Note that the quantities $y_{0j} \le 0$ can be obtained from the optimal simplex tableau of the problem $Q''$. If $a_0$ is the unique optimal point, we have $y_{0j} < 0$ for every $j \in N$.

Let us denote these projections by an increasing sequence

$$P = \{p_0, p_1, p_2, \ldots, p_k, \ldots\}, \qquad (4.6)$$

where $p_0 = 0$, $p_k < p_{k+1}$. When there is no confusion, $p_k$ is used to denote both a projection image and its distance from $a_0$ along $n$.

4.3  SETS OF CANDIDATE INTEGER POINTS

In this subsection, we describe how (4.6) is used to generate integer points for feasibility and optimality tests. Every $p_k \in P$ corresponds to one or several H-points, each of which in turn corresponds to one or several integer values of the coordinates $x_i$. Hence, for each $p_k$, $n$ discrete sets of values of the coordinate $x_i$

$$X_i(k) = \{x_i^1, x_i^2, \ldots, x_i^\nu\}, \quad i \in N \qquad (4.7)$$

can be uniquely determined. ($\nu$ depends on $i$ and $k$.) Candidate integer points are then obtained by forming all the possible combinations as follows:

$$(x_1, x_2, \ldots, x_i, \ldots, x_n), \quad x_i \in X_i(k),\ i \in N.$$

The following lemma shows that each of the discrete sets $X_i(k)$ consists of consecutive integers without gaps.

Lemma 3  The discrete set $X_i(k)$ can be expressed in the form

$$X_i(k) = \{x_i \mid \ell_i(k) \le x_i \le u_i(k),\ x_i\ \text{integer}\},$$

where $\ell_i(k)$ and $u_i(k)$ are non-negative integer bounds.

Proof. Without loss of generality, we may assume that the elements of $X_i(k)$ satisfy $0 \le x_i^1 < x_i^2 < \ldots < x_i^q < x_i^{q+1} < \ldots < x_i^\nu$. Suppose there is a gap of size $g$ ($\ge 2$) between $x_i^q$ and $x_i^{q+1}$, i.e. $x_i^{q+1} = x_i^q + g$. Let $x_i^q$ and $x_i^{q+1}$ correspond to two H-points on the halflines $C_\alpha$ and $C_\beta$ respectively (the possibility that $\alpha = \beta$ is not excluded), i.e. there exist $t_\alpha^* > 0$ and $t_\beta' > 0$ such that

$$x_i^q = a_{i0} - a_{i\alpha} t_\alpha^*, \quad x_i^{q+1} = a_{i0} - a_{i\beta} t_\beta'. \qquad (4.8)$$

We distinguish between two cases: (1) $x_i^{q+1} > a_{i0} + 1$; and (2) $x_i^{q+1} \le a_{i0} + 1$. For the case (1), (4.8) implies that $-a_{i\beta} t_\beta' > 1$. Define $t_\beta^\circ = t_\beta' + 1/a_{i\beta}$ and $x_i^\circ \equiv a_{i0} - a_{i\beta} t_\beta^\circ$. It follows easily that $t_\beta' > t_\beta^\circ > 0$ and $x_i^{q+1} > x_i^\circ = x_i^{q+1} - 1 > x_i^{q+1} - g = x_i^q$. This means that, on $C_\beta$, we can find an H-point $a_0 - a_\beta t_\beta^\circ$ (in front of the H-point $a_0 - a_\beta t_\beta'$) which generates the integer $x_i^\circ$ between $x_i^q$ and $x_i^{q+1}$, a contradiction. Similarly for the case (2), we can find on $C_\alpha$ an H-point which generates an integer between $x_i^q$ and $x_i^{q+1}$. (Q.E.D.)

As a result of Lemma 3, it is computationally much easier to generate candidates. Instead of (4.7), we simply keep track of a pair of bounds $\ell_i(k)$ and $u_i(k)$ for every $x_i$, $i \in N$, and every $p_k$, $k = 0, 1, 2, \ldots$, using the following stipulation:

(i) $\ell_i(-1) = \lceil a_{i0} \rceil$, $u_i(-1) = \lfloor a_{i0} \rfloor$. (4.9)

(ii) Let $S_i(k)$ be the set of integral $x_i$ values associated with those H-points which are projected onto $p_k$. Then

$$\ell_i(k) = \begin{cases} \min_{x_i \in S_i(k)} \{\ell_i(k-1), x_i\} & \text{if } S_i(k) \text{ is not empty,} \\ \ell_i(k-1) & \text{if } S_i(k) \text{ is empty,} \end{cases} \qquad (4.10)$$

$$u_i(k) = \begin{cases} \max_{x_i \in S_i(k)} \{u_i(k-1), x_i\} & \text{if } S_i(k) \text{ is not empty,} \\ u_i(k-1) & \text{if } S_i(k) \text{ is empty,} \end{cases} \qquad (4.11)$$

where $\lfloor z \rfloor$ (or $\lceil z \rceil$) is the largest (or smallest) integer which is smaller (or greater) than or equal to $z$. Note that there exists, for each $k$, at least one $i$ such that either $\ell_i(k) < \ell_i(k-1)$ or $u_i(k) > u_i(k-1)$.

Hence, the set of candidates generated by $\{p_0, p_1, \ldots, p_k\}$ is of the form

$$I(k) = \{x \mid \ell_i(k) \le x_i \le u_i(k),\ x_i\ \text{integer},\ i \in N\}. \qquad (4.12)$$
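The step rule (4.3) and the bound updates (4.9) to (4.11) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: exact rational arithmetic is used to avoid rounding problems, coordinates with $a_{ij} = 0$ are ignored as in the "otherwise" case of (4.3), negative integer targets are skipped to respect the nonnegativity in the definition of $t_j^\mu$, and the halfline data are made up.

```python
from fractions import Fraction
from math import ceil, floor

def next_step(a0, aj, t):
    """Smallest psi > 0 such that some coordinate of a0 - aj*(t + psi)
    becomes a nonnegative integer; None if no such step exists."""
    best = None
    for ai0, aij in zip(a0, aj):
        if aij == 0:
            continue
        x = ai0 - aij * t
        # <x>: smallest integer greater than x; {x}: largest integer smaller than x
        target = floor(x) + 1 if aij < 0 else ceil(x) - 1
        if target < 0:
            continue
        psi = (x - target) / aij
        if best is None or psi < best:
            best = psi
    return best

def h_point_params(a0, aj, count):
    """First `count` parameter values t^1 < t^2 < ... on one halfline."""
    ts, t = [], Fraction(0)
    for _ in range(count):
        psi = next_step(a0, aj, t)
        if psi is None:
            break
        t += psi
        ts.append(t)
    return ts

def integral_coords(a0, aj, t):
    """(i, value) pairs for moving coordinates that are nonnegative integers at t."""
    pairs = []
    for i, (ai0, aij) in enumerate(zip(a0, aj)):
        x = ai0 - aij * t
        if aij != 0 and x.denominator == 1 and x >= 0:
            pairs.append((i, int(x)))
    return pairs

def update_bounds(lower, upper, pairs):
    """Apply (4.10) and (4.11) for the integral values found at one projection."""
    lower, upper = list(lower), list(upper)
    for i, value in pairs:
        lower[i] = min(lower[i], value)
        upper[i] = max(upper[i], value)
    return lower, upper

# Hypothetical halfline x = a0 - aj * t in two dimensions.
a0 = [Fraction(3, 2), Fraction(5, 2)]
aj = [Fraction(1, 2), Fraction(-1, 4)]

ts = h_point_params(a0, aj, 3)
lower = [ceil(v) for v in a0]    # (4.9): the initial box is empty
upper = [floor(v) for v in a0]   # whenever a_i0 is fractional
for t in ts[:2]:
    lower, upper = update_bounds(lower, upper, integral_coords(a0, aj, t))
```

After the first two H-points on this single halfline the box has collapsed to the single candidate (1, 3). In the algorithm proper, H-points from all halflines are merged by their projections before the bounds are updated.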

4.4  FEASIBILITY AND OPTIMALITY TESTS

At the projection $p_k$, our subproblem is

$$\begin{cases} \text{maximize} & x_0 = cx \\ \text{subject to} & Ax \le b,\quad x \in I(k). \end{cases} \qquad (4.13)$$

A direct or algorithmic search may be required to determine whether (4.13) is infeasible or has an optimal solution. If infeasible, we proceed to the next projection $p_{k+1}$. Otherwise, (4.13) has an optimal solution. One of the main features of our parallel-cut algorithm is provided by the next result.

Theorem 4  Suppose $x(k)$ is an optimal point of (4.13). If $d(x(k)) \le p_k$, then $x(k)$ is also an optimal point of the integer linear program $Q$.

Proof. We shall prove this theorem by showing the contradiction that, if $x'$ is a feasible point of $Q$ satisfying $cx(k) < cx'$, then $x' \in I(k)$.


algorithm  follows: 

Step  0  Solve  the  associated  linear  program  Q'  by 
the  simplex  method  and  let  a^  be  its  opti- 
mal point.     If  ajj  is  integral,  it  is  also 
an  optimal  point  of  the  integer  program  Q. 
Otherwise,  go  to  Step  1. 

Step  1  (Initialization) 

1.1  Formulate  the  cone  problem  C.  (see 
Lemma  1  of  Section  3) . 

1.2  On  each  of  the  half lines  C.,  jeN,  gen- 
erate a  sequence  of  H-poinis  by  (4.2) 
and  (4.3).     For  each  of  the  H-points, 
keep  track  of  its  integer  coordinate (s) 

1.3  By   (4.5),  project  perpendicularly  the 
H-points  onto  the  normal  n  and  arrange 
the  sequence  of  projections  in  increas- 
ing order  or  magnitude 


P=-tPo'Pi  'P 


2'  ■ 


where  Pn=0,  p. <p. 


^0""'  ^i+l 
1.4  Let  k=-l. 
Step  2    (Moving  the  parallel  cut  into  the  cone) 

2.1  Let  k=k+l 

2.2  Solve  (4.13).  If  (4.13)  is  infeasible 
or  has  an  optimal  point  x(k)  such  that 
d(x(k))>Pj^,  repeat  Step  2.  Otherwise, 


the  optimal  point  x (k) ,  for  which 
d(x(k))<jp  ,   is  also  an  optimal  sol 
tion  of  the  integer  program  Q. 


In  practice,  the  following  refinements  may 
be  incorporated  into  the  above  formal  description 
of  the  algorithm. 

Since  the  algorithm  may  terminate  early  in 
the  process ,  it  would  be  a  waste  of  time  to  gen- 
erate too  many  H-points  and  projections  at  the 
outset.     Instead,  we  may  specify  intervals  on  n 
and  the  halflines  and  generate  the  H-points  and 
projections  within  these  intervals  one  after 
another  only  when  they  are  needed.     Suppose  H  is 
the  size  of  the  interval  on  n,  then  the  corres- 
ponding interval  size  for  t_.  is  -il  lie  ||/yp_.  ,  jeN. 

Also,  Step  2  of  the  algorithm  may  be  replaced 
by  the  following  more  elaborate  one: 


Consider  the  halfspace  associated  with  the 


O  (pj^)  -plane 


^j.N'^/*j'*j=^'  ^j=°' 


(4.14) 


where  t*=Pj^ l|c||/ (-y^ ^  )  >0  ,     jeN.      (See   (4.5)  and 

[1].)     iuppose  there  exists  a  feasible  point 

x'San-I.  .,a.t'.  of  Q  such  that  cx(k)<cx'.     Then,  by 
0     leN  3  3 

(4.4),  we  have  d  (x ' )  =c  (a^-x ' ) /II  c  11  = 

c  (  [ag-x  (k)  ]  +  [x  (k)  -X  ■  ] )  /II  c  || <d  (x  (k)  )  <p^  and 

t^<d(x"  )  ||c||/(-y|j  J<t^,   jeN.     This  implies  that  x' 

satisfies   (4.14).     By  Lemma  2,  every  coordinate 
plane  through  x'   intersects  a  half line  C.  at 
a„-a.t°,  say,   for  some  t°<t^.     By  definiiion  of 

(Q.E.D.) 


0     3   3-  3=  3 

I(k),  we  have  x'el(k). 


4.5     DETAILED  DESCRIPTION  OF  THE  PARALLEL-CUT 
ALGORITHM. 

A  detailed  description  of  the  parallel-cut 


Step  2 '  (Moving  the  parallel  cut  into  the  cone) 
2.1'  Find  the  smallest  k,  say  k  ,  such 
that  the  set  {x|Ax<b,  xel (k  )}  has 
at  least  one  feasible  point.  (If 
such  k  does  not  exist,  the  integer 
program  Q  does  not  have  a  feasible 
solution . ) 

2.2'  Solve  (4.13)  with  k=k°.     Let  x(k)  be 

the  optimal  point. 
2.3'   If  d(x(k))<pj^,  then  x(k)   is  also  an 

optimal  point  of  Q.     Otherwise,  go 

to  Step  2.4'. 
2.4'  Set  k=k+l.     Test  whether  the  set 

T (k) ={x |Ax<b,     xel (k) -I (k-1) , 
cx>cx (k-1) } 

is  feasible.     If  not,  define 
x(k)=x(k-l)   and  go  to  Step  2.3'. 
Otherwise,  go  to  Step  2.5'. 
2.5'  Solve  the  problem  max{cx |xeT (k) }  by 
any  direct  or  algorithmic  searching 
method.     Let  x(k)  be  the  optimal 
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point.     Go  to  Step  2.3'. 

4.6     DEGENERACY,   NON-UNIQUENESS  AND  FINITENESS 

In  this  subsection,  we  show  that  the  parallel- 
cut  algorithm  works  without  any  modification  when 
the  optimal  solution  of  the  associated  linear  pro- 
gram Q'  is  degenerate.     We  also  discuss  the  rela- 
tion between  the  finiteness  of  the  algorithm  and 
the  uniqueness  of  this  optimal  solution. 

Suppose $(a_0, s)$ is the optimal point of Q'. Degeneracy occurs when some of the basic components of $a_0$ or $s$ have zero value. Define the index sets

$$B^0 = \{i \mid a_{i0} \text{ basic and } a_{i0} = 0\}; \qquad M^0 = \{i \mid s_i \text{ basic and } s_i = 0\}.$$

Geometrically, in the n-space of the structural variables, degeneracy implies that the hyperplanes associated with the halfspaces $\sum_{j \in N} a_{ij} x_j \le b_i$, $i \in M^0$, or $x_i \ge 0$, $i \in B^0$, also pass through $a_0$. In the same space, let us define

$$X = \{x \mid \textstyle\sum_{j \in N} a_{ij} x_j \le b_i,\ i \in M;\ x_j \ge 0,\ j \in N\}$$

and

$$X' = \{x \mid \textstyle\sum_{j \in N} a_{ij} x_j \le b_i,\ i \in M - M^0;\ x_j \ge 0,\ j \in N - B^0\}.$$

Thus, X' is obtained from X by dropping the constraints associated with those variables (structural or slack) which are basic but have value 0. Obviously, $X \subseteq X'$. Balas [1] proves the following lemma.

Lemma 5  $a_0$ is the vertex of X' and $cx = ca_0$ is a supporting hyperplane for X'. X' has n distinct edges adjacent to $a_0$, and each halfline (3.5) contains exactly one such edge.

Interpreted geometrically (see Figure 2), this lemma implies that, in case of degeneracy, some of the halflines (3.5) may not contain any edge of the cone (incident at $a_0$) of the feasible region X of the problem Q'. However, each halfline (3.5) contains an edge of another polytope X'.


Figure 2     Degeneracy at $a_0$


In connection with our parallel-cut algorithm, it is obvious that the halflines (3.5) can also be used to generate H-points, which in turn are used to generate integer points to be tested for feasibility with respect to X. In other words, the algorithm works without any modification.

There  is  a  close  relation  between  the  unique- 
ness of  the  optimal  solution  of  the  associated 
linear  program  Q'  and  the  finiteness  of  the  parallel 
cut  algorithm. 

If  the  optimal  point  of  Q'  is  unique,  the 
following  argument  shows  that  the  algorithm  is 
finite.     Let  x'  be  an  arbitrary  feasible  integer 
point  (assumed  existing)   of  Q.     Then,  the  0(x')- 
plane  intersects  every  halfline  at  a  finite  point 
and  d(x')   is  an  upper  bound  on  the  projections  to 
be  generated.     Thus,  the  numbers  of  distinct 
H-points  and  projections  are  finite.     Hence,  the 
number  of  steps  and  the  number  of  candidates  to  be 
tested  at  each  step  are  both  finite. 

If some of the relative-cost factors $y_{0j}$ are zero, the optimal point of Q' is not unique. Define the index set $D = \{j \mid j \in N,\ y_{0j} = 0\}$ and the polytope

$$X'' = \{x \mid x = a_0 - \textstyle\sum_{j \in D} a_j t_j,\ t_j \ge 0,\ x \in X\}.$$

Geometrically, non-uniqueness implies that X'' lies on the $O(a_0)$-plane (see Figure 3). Hence, all the H-points and integer points lying in X'' are projected onto $a_0$, and the discrete set $I(0)$ as defined in (4.12) may not be empty. We distinguish between the following two cases:

(i)   X is bounded.

In this case, the number of H-points generated on each of the halflines $C_j$, $j \in D$, is finite. By an argument similar to that of the uniqueness case, we can show that the parallel-cut algorithm is finite if boundedness is incorporated into the definition (4.3).

(ii)  X is unbounded.

In this case, it may be necessary to generate infinitely many H-points on some of the $C_j$, $j \in D$. If X'' contains an integer point, Q becomes a problem of locating any such integer point. However, it may happen that, theoretically at least, X'' does not contain any integer point but there exist feasible integer points of Q which are arbitrarily close to X''. (See the remark below.) This implies that, at the initial projection $p_0$ of the algorithm, the search for the best feasible integer point over the infinite set $I(0)$ never terminates. In practice, such a situation may be remedied by introducing an artificial constraint (the dotted line of Figure 3). Then an approximate solution is sought over a bounded portion of the unbounded feasible region.

Remark: It seems to the author that, theoretically speaking, this situation gives rise to trouble for most of the existing integer programming methods.

Figure 3    Non-uniqueness of optimal point

Application  of  the  underlying  ideas  of  the 
approach  presented  in  this  paper  can  obviously  be 
extended  to  mixed  integer  programming,  and  probably 
to  nonlinear  integer  programming  with  linear  con- 
straints . 
Abstract

A  monomial  programming  problem  is  one 
of  minimizing  a  polynomial  in  several 
variables  subject  to  monomial  constraints. 
A  log  transformation  changes  it  into  a 
problem  with  non-linear  objective  and 
linear  constraints  which,   under  certain 
conditions  can  be  solved  by  Zangwill's 
convex  simplex  method.     We  show  a  direct 
method,   based  on  our  previous  work,  of 
solving  the  problem  using  a  simplex-like 
tableau . 

1.  INTRODUCTION 

In  a  recent  series  of  papers    [3,  4, 
5,    6],   we  have  presented  simplex-like 
algorithms  to  several  special  kinds  of 
problems  arising  in  modular  design, 
geometric  programming,   and  elsewhere.  In 
the  present  paper  we  shall  specialize  the 
general  algorithm  of    [4]    to  the  case  of  a 
programming  problem  in  which  the 
constraints  involve  monomial  expressions. 
For this case a true simplex algorithm is possible in the sense that any of the standard linear programming routines can rather easily be modified to solve monomially constrained problems. Also all of the techniques for handling large scale linear programming problems, such as the revised simplex method, decomposition, lexicographic ordering, etc., are immediately transferable to the new algorithm. It therefore follows that monomially constrained problems in hundreds of variables can be solved.

In Sections 2-7 we present a description of the new method. Computational results are presented in Section 7. A monomial model concerning excess inventories is presented and solved in Section 8.


2. 


NOTATION 


Because we will be working with monomial expressions and their derivatives in several variables we introduce some special vector and matrix notation to make it easy to do so. We define the $\odot$ product of an n-component row vector h and an n-component column vector x as:

(1)  $h \odot x = (h_1, \ldots, h_n) \odot (x_1, \ldots, x_n)^T = x_1^{h_1} x_2^{h_2} \cdots x_n^{h_n}$.

We extend this definition in the obvious way to the $\odot$ product of an m×n matrix H and an n-component column vector x:

(2)  $H \odot x = (h_1 \odot x, \ldots, h_m \odot x)^T$,

where $h_i$ is the ith row of H. As a

numerical  example  observe  that:      /  \ 
1       0     -1\  /3\  /  9/4  \ 

(3)  (  0       1       J        O        1    I  = 

0       0       2/  \4/3/  y4/3)y 


Using the $\odot$ product and the ordinary matrix product we can easily write a polynomial in several variables; let d be an m-component row vector, H an m×n matrix, and x an n-component column vector; then:

(4)  $d(H \odot x) = d_1 x_1^{h_{11}} \cdots x_n^{h_{1n}} + \cdots + d_m x_1^{h_{m1}} \cdots x_n^{h_{mn}}$

is a polynomial in the variables.
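As a small illustration of definitions (1), (2) and (4), the following Python sketch evaluates the $\odot$ product and a polynomial $d(H \odot x)$. The helper names `circ_vec`, `circ` and `poly` are hypothetical, and the evaluation point is chosen arbitrarily; the coefficients correspond to $f = 6x_1x_2^2x_3^3 + 7x_1^4x_2^5$.

```python
import math

def circ_vec(h, x):
    """h (.) x = x_1**h_1 * ... * x_n**h_n, as in definition (1)."""
    return math.prod(xj ** hj for hj, xj in zip(h, x))

def circ(H, x):
    """H (.) x: apply the row-wise product to every row of H, as in (2)."""
    return [circ_vec(h, x) for h in H]

def poly(d, H, x):
    """d(H (.) x): ordinary inner product of d with the monomial vector, as in (4)."""
    return sum(di * mi for di, mi in zip(d, circ(H, x)))

d = [6, 7]
H = [[1, 2, 3],
     [4, 5, 0]]
x = [1.0, 2.0, 1.0]          # arbitrary evaluation point
monomials = circ(H, x)       # [x1*x2^2*x3^3, x1^4*x2^5]
value = poly(d, H, x)
```

Each row of H holds the exponents of one monomial, so adding a term to the polynomial amounts to appending a row to H and an entry to d.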

We recall next the definition of the Schur product of vectors. Let a and b be two m-component row vectors; then the Schur product is defined as:

(5)  $a * b = (a_1, \ldots, a_m) * (b_1, \ldots, b_m) = (a_1 b_1, \ldots, a_m b_m)$.

Using this we can write the derivative of a polynomial $d(H \odot x)$. Let $h^{(j)}$ be the jth column of H and let $h^{(j)\prime}$ be its transpose; let $H^{(j)}$ be the matrix H with 1 subtracted from each entry in the jth column. Since the formula $\dfrac{dx^n}{dx} = n x^{n-1}$ works even when n = 0, it follows that:

(6)  $\dfrac{\partial}{\partial x_j}\,(d(H \odot x)) = (d * h^{(j)\prime})(H^{(j)} \odot x)$.

From this it further follows that:

(7)  $x_j \dfrac{\partial}{\partial x_j}\,(d(H \odot x)) = (d * h^{(j)\prime})(H \odot x)$.

As an example, $f = 6x_1x_2^2x_3^3 + 7x_1^4x_2^5$ can be expressed in the form of equation (4) with $d = (6, 7)$ and

$$H = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 0 \end{pmatrix}.$$

The expressions $\dfrac{\partial f}{\partial x_1} = 6x_2^2x_3^3 + 28x_1^3x_2^5$ and $x_1 \dfrac{\partial f}{\partial x_1} = 6x_1x_2^2x_3^3 + 28x_1^4x_2^5$ can then be expressed as in equations (6) and (7) with $h^{(1)\prime} = (1, 4)$ and

$$H^{(1)} = \begin{pmatrix} 0 & 2 & 3 \\ 3 & 5 & 0 \end{pmatrix}.$$

3.        PROBLEM  STATEMENT 

By a monomially constrained problem we shall mean a problem of the form

(8)  Minimize $d(C \odot x) = g$
     subject to $A \odot x \ge b$, $x \ge 0$,

where d is 1×k, C is k×n, A is m×n, x is n×1, and b is m×1. We assume $b > 0$.

Note that by making the transformation:

(9)  $x_j = e^{y_j}$

and taking logarithms we could change (8) into the problem (using the obvious definitions):

(10)  Minimize $d(C \odot e^y)$
      subject to $Ay \ge \ln b$.

Problem (10) has linear constraints, but y is not constrained to be nonnegative. Following Charnes and Kirby [1] we then say that (8) is transformably linear.
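The equivalence between the monomial constraints $A \odot x \ge b$ and the linear constraints $Ay \ge \ln b$ under $x_j = e^{y_j}$ can be illustrated with a short Python sketch. The matrix A, the vector b and the test points are illustrative values chosen here, and the function names are hypothetical; a small tolerance absorbs floating-point rounding on tight constraints.

```python
import math

def monomial_feasible(A, b, x, tol=1e-9):
    """Check A (.) x >= b row by row for a strictly positive x."""
    return all(
        math.prod(xj ** aij for aij, xj in zip(row, x)) >= bi - tol
        for row, bi in zip(A, b)
    )

def log_feasible(A, b, y, tol=1e-9):
    """Check the transformed linear constraints A y >= ln b."""
    return all(
        sum(aij * yj for aij, yj in zip(row, y)) >= math.log(bi) - tol
        for row, bi in zip(A, b)
    )

A = [[1, 2], [2, 1], [1, 1]]       # exponents (illustrative)
b = [3.0, 9.0, 4.0]                # right hand sides, all positive
points = [(1.0, 9.0), (2.25, 16 / 9), (1.0, 1.0), (3.0, 0.5)]

in_x_space = [monomial_feasible(A, b, x) for x in points]
in_y_space = [log_feasible(A, b, [math.log(v) for v in x]) for x in points]
```

Because the logarithm is monotone, each point is feasible in one formulation exactly when its image is feasible in the other, which is what the two lists record.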

The following facts are well known. If $d > 0$ then $d(C \odot e^y)$ is a strictly convex function of $y_1, \ldots, y_n$. If $d \ge 0$ then the same function is a convex (but not necessarily strictly convex) function of these variables.

If $d(C \odot e^y)$ is a convex function of y we could use Zangwill's convex simplex method [9] to solve problem (10) and thus (8). However in this paper we shall specialize our previous work [4] on simplex-like methods for solving nonlinear problems with nonlinear constraints to provide a simplex method that is somewhat more general than the convex simplex method for solving problem (8) directly.

Let $\lambda = (\lambda_1, \ldots, \lambda_m)$ be a 1×m vector of Lagrange multipliers for the constraints of (8). Then the Lagrangian of (8) is:

(11)  $L = d(C \odot x) - \lambda(A \odot x - b)$

where $A \odot x \ge b$ now is considered to include the constraints $x \ge 0$. Taking the partial derivative of L with respect to $x_j$, using (6), and setting the result equal to zero yields:

(12)  $\dfrac{\partial L}{\partial x_j} = (d * c^{(j)\prime})(C^{(j)} \odot x) - (\lambda * a^{(j)\prime})(A^{(j)} \odot x) = 0$.

Multiplying (12) by $x_j$ and using (7) we obtain:

(13)  $(\lambda * a^{(j)\prime})(A \odot x) = (d * c^{(j)\prime})(C \odot x)$.

We now use the well-known Kuhn-Tucker complementary slackness condition:

(14)  $\lambda_i (a_{(i)} \odot x) = \lambda_i b_i$ for all i,

where $a_{(i)}$ is the ith row of A, and define

(15)  $p_i = \lambda_i b_i$ for all i.

Substituting (14) and (15) into (13) gives:

(16)  $p\,a^{(j)} = (d * c^{(j)\prime})(C \odot x)$ for all j.

We shall call $p_i$ the dual variables and (16) the dual equations.

As in linear programming, duals associated with constraints of the type $x \ge 0$ must be zero for basic variables, so that at the optimum (16) must be satisfied for the original m constraints. If we let the right hand side of (16) be $q_j$ for the basic variables then in matrix form $pB = q$ and $p = qB^{-1}$ where B is the basis. Likewise, ignoring the nonnegativity constraints, (16) becomes:

$(d * c^{(j)\prime})(C \odot x) - p\,a^{(j)} \ge 0$ for nonbasic variables.

The left hand side of this inequality corresponds to the "reduced cost", $r_j$, values in linear programming, or explicitly:

(16b)  $r_j = (d * c^{(j)\prime})(C \odot x) - qB^{-1}a^{(j)}$ for all j nonbasic,

where $B^{-1}a^{(j)}$ is merely the jth column of the present tableau.

4.        RELATIONSHIP  TO  LINEAR  PROGRAMMING 

Note the similarity between the ordinary linear programming dual equations and (16). If the ith constraint in (8) is not tight $p_i = 0$ and if it is tight $p_i \ge 0$. Hence we can use the $p_i$'s or equivalently $\lambda_i = p_i / b_i$ in the same way as they are used in ordinary linear programming.

If  we  assume  that  one  of  the  monomial 
constraints  in    (8)    is  tight  and  use  it  to 
eliminate  a  variable,    say  x.,    from  the 
rest  of  the  equations  then     it  is  not  hard 
to  see  that  this  transformation  will  make 
linear  changes  in  the  exponents  of  the 
other  monomials  and  log  linear  change  in 
the  right  hand  sides. 

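For instance, consider the polynomial equations:

$$x_1 x_2^2 = 3, \qquad x_1^2 x_2 = 9, \qquad x_1 x_2 = 4.$$

Using the notation of the previous section these can be written $A \odot x = b$, or in detached coefficient form as:

$$\left(\begin{array}{cc|c} 1 & 2 & 3 \\ 2 & 1 & 9 \\ 1 & 1 & 4 \end{array}\right)$$

where A is the 3×2 matrix on the left and b is the 3×1 vector on the right. If we now want to use the second equation to eliminate $x_1$ from the other two equations we can multiply by the corresponding pivot matrix (see [2]) in front as follows:

$$\begin{pmatrix} 1 & -\tfrac12 & 0 \\ 0 & \tfrac12 & 0 \\ 0 & -\tfrac12 & 1 \end{pmatrix} \left(\begin{array}{cc|c} 1 & 2 & 3 \\ 2 & 1 & 9 \\ 1 & 1 & 4 \end{array}\right) = \left(\begin{array}{cc|c} 0 & \tfrac32 & 1 \\ 1 & \tfrac12 & 3 \\ 0 & \tfrac12 & \tfrac43 \end{array}\right)$$

Note that we used ordinary matrix multiplication in the A area of the tableau, but $\odot$ multiplication in the b-area. All the familiar rules of pivoting in linear programming apply if they are appropriately modified for operation on the right hand side.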
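The pivoting rule just described (ordinary matrix multiplication on the exponent matrix, $\odot$ multiplication on the right hand side) can be sketched as follows. This is a hedged illustration: it assumes the system reads $x_1x_2^2 = 3$, $x_1^2x_2 = 9$, $x_1x_2 = 4$ and that the pivot is taken on the $x_1$ entry of the second equation; the helper names `pivot_matrix`, `matmul` and `circ` are hypothetical.

```python
import math
from fractions import Fraction

def pivot_matrix(A, r, c):
    """Pivot matrix P that turns column c of A into the rth unit vector."""
    m = len(A)
    P = [[Fraction(int(i == k)) for k in range(m)] for i in range(m)]
    for i in range(m):
        P[i][r] = Fraction(-A[i][c], A[r][c])
    P[r][r] = Fraction(1, A[r][c])
    return P

def matmul(P, A):
    """Ordinary matrix product P A, used on the exponent part of the tableau."""
    return [[sum(P[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(P))]

def circ(P, b):
    """P (.) b: new right hand side i is the product of b_k ** P[i][k]."""
    return [math.prod(bk ** float(p) for p, bk in zip(row, b)) for row in P]

A = [[1, 2], [2, 1], [1, 1]]   # assumed exponents of x1, x2
b = [3.0, 9.0, 4.0]            # assumed right hand sides

P = pivot_matrix(A, r=1, c=0)  # second equation, variable x1
new_A = matmul(P, A)
new_b = circ(P, b)
```

Exact fractions are used for the exponents because they are combined linearly, while the right hand sides are combined multiplicatively and are kept as floats.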

Most of the problems encountered in solving ordinary linear programming problems, such as degeneracy (which can be handled by perturbation techniques or lexicographic orderings), can occur, and some new difficulties as well. For instance it is possible that one of the factors of the objective function could tend to zero without a basis change being needed. To handle such problems we can employ regularization constraints of the form $(C \odot x)_i \ge k_i > 0$ where $k_i$ is a small number. Most of the other difficulties can be handled by similar modifications of well known linear programming techniques and will not be discussed further here.

We next describe a simplex method for solving problems of the form (8). To start it we need a Phase I procedure. To this end we add slack variables $z_i$ and artificial variables $u_i$ to the constraints of (8) to obtain the equivalent problem:

(17)  Minimize $d(C \odot x)$
      subject to $(A, -I, I) \odot (x, z, u)^T = b$, $z_i, u_i \ge 1$.

Let f be a column vector of all ones. If $b \ge f$ it follows that an initial feasible solution is x = f, u = b, so that the initial tableau for the simplex Phase I start is:

$$\begin{array}{c|cc|c} & x & z & \\ \hline u & A & -I & b \end{array} \tag{18}$$

If, however, some components of b are < 1 then the initial feasible basis can be made up of the components $u_i$ for which $b_i \ge 1$ and the components $z_i$ for which $b_i < 1$. We now use a Phase I objective function $e \odot u$ (where e is an m-component vector of all ones) and pivot until all the $u_i$ artificial variables have been eliminated, in complete analogy to ordinary linear programming.

In  the  next  section  we  describe  in 
detail  the  simplex  method  for  monomially 
constrained  problems  which  is  based  on  the 
general  method  described  in  our  paper  [4]. 
The main differences from the ordinary linear programming simplex method are, first, that the variables are not constrained to be $\ge 1$ and, second, that nonbasic variables can take on values other than 1 or 0. The latter is necessary in order that we can find optimum solutions that are not necessarily determined by intersections of constraining hyperplanes. We use this method of describing the algorithm rather than the parametric programming method of [4] because it seems to be simpler in the present context.

5.        THE  SIMPLEX  ALGORITHM  FOR  MONOMIALLY 
CONSTRAINED  PROBLEMS 

The  description  of  the  algorithm  is 
similar  to  that  for  ordinary  linear  pro- 
gramming.    All  of  the  variants  of  the 
latter  can  be  made  to  the  present 
algorithm  without  difficulty  so  we  do  not 
go  into  detail  in  their  discussion  here. 

Phase I. Set up the problem, find the quantities A, b, C, and d and substitute them into the initial tableau (18).

(A) Go to the MAIN routine using the objective function $g = u_1 u_2 \cdots u_m$. If the problem has a solution with value greater than 1 then the original problem has no solution; stop. If the problem has a solution with value 1 then all the artificial variables are nonbasic (or can easily be made nonbasic and eliminated from the tableau).

Phase II. Set $g = d(C \odot x)$ and go to the MAIN program. When that program is complete it will provide an answer to the originally stated problem, or to the regularized problem if infinite solutions are found.

MAIN program.

1.  Dual Solution Routine

    (a) For each basic variable $v_i$ calculate the value of $q_i$ at the current solution.

    (b) For each nonbasic $v_j$ calculate the reduced cost, $r_j$, using (16b), where the summation in the second term is to be taken over the basic variables and hence can be computed from the current tableau.

2.  Find Incoming Vector. Calculate the sets

    $S_1 = \{v_j \mid v_j$ is nonbasic, $v_j$ is tight at a lower bound and $r_j > 0\}$

    $S_2 = \{v_j \mid v_j$ is nonbasic, $v_j$ is not tight at a lower bound and $r_j \ne 0\}$

    If $S_1 \cup S_2 = \emptyset$ the current solution is optimal. Go to 7. Else go to 3.

3.  Find largest indicator. Find j as the index that maximizes

    $\max\{\max_{k \in S_1} |r_k|,\ \max_{k \in S_2} |r_k|\}$.

    If $j \in S_1$, or $j \in S_2$ and $r_j > 0$, go to 4. If $j \in S_2$ and $r_j < 0$ go to 5.

4.  Find solution with larger $v_j$.

    (a) Find the maximum extent $\bar v_j$ to which $v_j$ can be increased while keeping feasibility. This can be done by using the column of the current tableau under $v_j$ and the right hand side column. Determine the outgoing vector $v_k$.

    (b) Solve the auxiliary problem for variable $v_j$; let $v_j^0$ be its optimum value.

    (c) If $\bar v_j \le v_j^0$ go to 6.

    (d) If $\bar v_j > v_j^0$, set $v_j = v_j^0$ in the tableau, calculate the modified right hand sides and go to 1.

5.  Find solution with smaller $v_j$.

    (a) Find the maximum extent $\bar v_j$ to which $v_j$ can be decreased while keeping feasibility. This can be done by using the column of the current tableau under $v_j$ and the right hand side column. Determine the outgoing vector $v_k$.

    (b) Solve the auxiliary problem for variable $v_j$; let $v_j^0$ be its optimum value.

    (c) If $\bar v_j \ge v_j^0$ go to 6.

    (d) If $\bar v_j < v_j^0$, set $v_j = v_j^0$ in the tableau, calculate the modified right hand sides and go to 1.

6.  Pivot exchange: make $v_j$ basic and $v_k$ nonbasic. Let P be the corresponding pivot matrix (see [2]); then use the calculation $PA^{(old)} = A^{(new)}$ for the A part of the tableau and $P \odot b^{(old)} = b^{(new)}$ for the b part of the tableau.

7.  End of MAIN program.

AUXILIARY problem routine for nonbasic variable $v_j$.

1.  Evaluate the objective function g at the current solution but assuming $v_j$ is a variable. Let $g(v_j)$ be the corresponding function.

2.  Use a search (or other) technique to find the unconstrained minimum value $v_j^0$ of $g(v_j)$.

6 .  EXAMPLES 

We work two examples in order to illustrate the method. We state first the constraints:

(19)  $x_1 x_2^2 \ge 3$,  $x_1^2 x_2 \ge 9$,  $x_1 x_2 \ge 4$,  $x_1, x_2 \ge 1$.

The feasible region is shown in Figure 7. Adding slack and artificial variables these become:

(20)  $x_1 x_2^2 z_1^{-1} u_1 = 3$,  $x_1^2 x_2 z_2^{-1} u_2 = 9$,  $x_1 x_2 z_3^{-1} u_3 = 4$.

The constraints $x_1 \ge 1$, $x_2 \ge 1$ will be imposed by the method in order to keep tableaus small. We now use the Phase I objective function $g = u_1 u_2 u_3$ and do the Phase I calculations in condensed tableau form (Figure 1). It is easy to show that $v_i\,\partial g / \partial v_i = u_1 u_2 u_3$ for i = 1, 2, 3. The corresponding dual variable calculations appear on the right and below the tableau. The pivot element is circled; the new tableau with variable $u_1$ dropped appears in Figure 2. The $\odot$ operation was performed on the right hand side. There is one positive indicator (reduced cost) in column 2. The second
and  third  rows  indicate  the  relationships 


3-2  1 
^2^1  ^2 


^1"1 


when         is  decreased  U2  and  u^  are 

decreased.     Since  we  don't  want  U2  to  go 

below  1  we  pivot;   the  new  tableau  with 
variable  U2  dropped  appears  in  Figure  3. 

Using the same reasoning as before we pivot on the circled element in Figure 3 and obtain the primal feasible tableau of Figure 4. The solution $x_1 = 9/4$, $x_2 = 16/9$, $z_1 = 64/27$, $z_2 = z_3 = 1$ and $u_1 = u_2 = u_3 = 1$ is feasible for the constraints (20).


Suppose now we wish to solve the problem:

(21)  Minimize $x_1^4 x_2 = g(x_1, x_2)$
      subject to constraints (19).

The tableau of Figure 4 with dual variable calculations for this objective function is shown in Figure 5. If we increase $z_3$ we have $x_1 = (9/4) z_3^{-1}$ and $x_2 = (4/3)^2 z_3^2$, so that $g(z_3) = (9/4)^4 (4/3)^2 z_3^{-2}$ and increasing $z_3$ decreases $g(z_3)$ indefinitely. But the limit on increases of $z_3$ is the requirement $x_1 \ge 1$. Therefore we pivot on the circled 1 in Figure 5, giving the tableau of Figure 6.

Hence the optimum solution is $x_1 = 1$, $x_2 = 9$, $z_1 = 27$, $z_2 = 1$, $z_3 = 9/4$, which can be seen in the graph of Figure 7. The value of the objective function is $g(x_1, x_2) = 9$.

We now solve a problem with the same constraints but a different objective function in order to show the solution of the auxiliary problem. The new problem is:

(22)  Minimize $x_1^4 x_2 + 2 x_1^{-1} x_2$
      subject to constraints (19).

[Figures 1-6: condensed simplex tableaus for the Phase I and Phase II calculations of the examples. Figure 7: graph of the feasible region of constraints (19).]

The tableau of Figure 4 with dual variable calculations for the new objective function is shown in Figure 8. We see that we should try to increase $z_3$ as in the previous example. From Figure 8 we see that $x_1 = (9/4) z_3^{-1}$ and $x_2 = (4/3)^2 z_3^2$, so that the auxiliary problem is to:

Minimize $g(z_3) = (9/4)^4 (4/3)^2 z_3^{-2} + 2(4/9)(4/3)^2 z_3^3$.

Differentiating g with respect to $z_3$, setting $g'(z_3) = 0$ and solving gives the solution:

$x_1 = 1.25$,  $x_2 = 5.80$,  $z_1 = 13.97$,  $z_2 = 1$,  $z_3 = 1.81$,  $g(x_1, x_2) = 23.28$.

The final dual solution shows that this solution is optimal, as shown in Figure 9. Figure 7 shows the graph of $g(x_1, x_2) = 23.28$ touching the boundary of the feasible set at the optimum point.
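The auxiliary problem of this example can be reproduced numerically. The sketch below assumes the objective of (22) is $x_1^4x_2 + 2x_1^{-1}x_2$ with $x_1 = (9/4)z_3^{-1}$ and $x_2 = (4/3)^2 z_3^2$, and it uses a plain golden-section search as a stand-in for whatever one-dimensional search one prefers; the search interval $[1, 9/4]$ (lower bound of $z_3$ up to the limit imposed by $x_1 \ge 1$) and all function names are choices made for this illustration.

```python
import math

COEF_A = (9 / 4) ** 4 * (4 / 3) ** 2      # coefficient of z**-2
COEF_B = 2 * (4 / 9) * (4 / 3) ** 2       # coefficient of z**3

def g(z):
    """Objective along the edge parametrized by z3."""
    return COEF_A * z ** -2 + COEF_B * z ** 3

def golden_section(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
    return (a + b) / 2

z3 = golden_section(g, 1.0, 9 / 4)
z3_closed = (2 * COEF_A / (3 * COEF_B)) ** 0.2   # from g'(z) = 0
x1 = (9 / 4) / z3
x2 = (4 / 3) ** 2 * z3 ** 2
z1 = x1 * x2 ** 2 / 3                            # slack of the first constraint
g_min = g(z3)
```

Since the minimizer falls strictly inside the interval, no pivot is triggered and the nonbasic variable $z_3$ simply takes the interior value, which is the situation the auxiliary routine is designed for.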

7.       COMPUTATIONAL  RESULTS 

Some  computational  results  are  shown 
in  Tables  1,   2  and  3.     The  program  used 
was  written  in  Basic  and  run  on  a 
Burroughs'   6700.     Processing  time  included 
all  central  processing  time  except  that 
used  for  input/output.     The  probem  para- 
meters were  generated  randomly.  The 
exponents  for  the  constraints  were 
integers  between  minus  one  and  three 
while  the  right  hand  sides  were  integer 
numbers  between  one  and  one  hundred.  The 
exponents  in  the  objective  function  were 
random  integers  between  zero  and  four 


multiplied  by  one  half.     Finally,   the  term 
coefficients  in  the  objective  function 
were  random  integers  between  zero  and  nine 

In the three tables a pivot means the
same as in linear programming, i.e., a
basis change. A pass is the operation of
modifying the value of some variable,
thereby changing the solution. In some
cases, of course, this modification will
lead to a pivot, but not always. As we
would expect from linear programming,
pivots seem to be very fast. Problems with
a high proportion of total passes being
pivot operations have comparatively low
solution times. As Table 1 shows, problems
with many constraints relative to variables
are solved mostly through pivoting. This
seems to result from the reduced feasible
region and the resulting high probability
of finding the optimum at an extreme point
or, at least, at the intersection of
several constraints. This can be verified
again in Table 3. As can be seen in
Table 2, more complex objective functions
also seem to block out extreme point
solutions.

In presenting these computational
results certain considerations must be
made concerning the program itself:

ONE DIMENSIONAL SEARCH PROCEDURE.
Since pivots can be done rapidly compared
to searching, the search procedure is not
used unless pivoting would yield a solution
worse than or equal to the initial value
(with the exception that for degenerate
situations pivoting is always performed).
This criterion seems to reduce the total
number of searches which must be performed,
thereby reducing total time. When
searching is undertaken a combination
quadratic fit and Fibonacci search


[Figure residue: tableau entries for the preceding worked example; the individual entries are not recoverable from the extracted text.]
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Phase  I  Phase  II  Processing 

Constraints  Variables  Pivots  Variables  Pivots  Passes  Time  (Sec.) 


5 

15 
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15 
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2.5 

c 
D 

OA 

30 

5 

25 

4 

58 

48.0 

1  A 

IV 
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11 

15 

1 

1 
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iu 

O  A 

30 

14 

OA 

20 

5 

6 
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35 

16 

25 

8 

29 
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21 
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30 
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42 

35 
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3 
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03 

4  / 
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40 

6 

6 

13.  8 

25       70       54       45       15       15       20.5
25       75       45       50       27       46       95.5

/  u 

c 

J 

c 

J 

10  .  o 

30       80       73       50       22       22       37.5
30       85       75       55       23       23       39.2
35      100       68       65       22       22       51.6

Table 1: Time results. Objective function has five terms
- no unconstrained variables.


Terms in              Phase I    Phase II    Phase II    Processing
Objective Function    Pivots     Pivots      Passes      Time (Sec.)

        5                5          4           6            2.5
       10                5          3           5            3.9
       15                5          3          35           73.7

Table 2: Time results. 5 constraints,
20 variables in Phase I; 15 variables
in Phase II - no unconstrained
variables.


Unconstrained 
Variables 


Constraints 


Variables  Phase  I 

Phase  I     Phase  II  Pivots 


Phase  II 
Pivots  Passes 


Processing 
Time  (Sec.) 


0 
5 
0 
10 


5 
5 
10 
10 


15 
15 
30 
30 


10 
10 
20 
20 


6 
5 
14 
10 


1 
9 
6 
21 


2.5 
8.7 
3.5 
28.3 


Table  3:     Time  results.     Objective  function 
has  five  terms. 
procedure is used. This portion of the
program was written to take advantage of
the speed of quadratic prediction and the
robustness of the Fibonacci search.
Search is stopped only when the machine
tolerance (twelve significant digits) in
the objective function is reached, i.e.,
the predicted points are so close that
the objective functions are equal to within
twelve significant digits.
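The termination test described above can be illustrated with a small sketch. It is not the authors' Basic routine: a plain golden-section search is substituted for the combined quadratic-fit and Fibonacci procedure, and the function name, the `sig_digits` parameter and the spacing guard are assumptions made only for this illustration.

```python
import math

def golden_section_minimize(f, lo, hi, sig_digits=12, max_iter=200):
    """Minimize a unimodal f on [lo, hi].

    Stops when the two interior points are close together and their
    objective values agree to `sig_digits` significant digits.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0      # about 0.618
    rel_tol = 10.0 ** (-sig_digits)
    a, b = lo, hi
    x1 = b - inv_phi * (b - a)
    x2 = a + inv_phi * (b - a)
    f1, f2 = f(x1), f(x2)
    for _ in range(max_iter):
        values_agree = abs(f1 - f2) <= rel_tol * max(abs(f1), abs(f2), 1e-300)
        # Added safeguard (not from the paper): two far-apart points can
        # give equal values by symmetry, so also require close points.
        points_close = (x2 - x1) <= math.sqrt(rel_tol) * max(1.0, abs(x1) + abs(x2))
        if values_agree and points_close:
            break
        if f1 < f2:                 # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - inv_phi * (b - a)
            f1 = f(x1)
        else:                       # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + inv_phi * (b - a)
            f2 = f(x2)
    return (x1, f1) if f1 <= f2 else (x2, f2)

t_best, f_best = golden_section_minimize(lambda t: (t - 2.0) ** 2 + 3.0, 0.0, 5.0)
```

Because the objective is flat near a minimum, agreement of function values to twelve digits corresponds to only about six digits in the step itself, which is consistent with the text's remark that the tolerance was chosen as fine as possible.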


STOPPING RULES. Stopping rules in
nonlinear programming are by necessity
extremely complex. Setting straight limits
on the maximum allowable size of the
reduced costs associated with movable
variables is difficult, since the
importance of these reduced costs can only
be considered in terms of the absolute size
of the objective function. Also, when the
one dimensional search is used the
resulting minimum solution is only an
approximation, which leads to nonzero
reduced costs associated with the last
variable moved. (Exact solutions in the
linear search would, of course, lead to a
zero reduced cost for this variable.)
Typically even small deviations in the
linear search seem to yield large nonzero
reduced costs, hence the choice of as fine
a tolerance as possible in the search
portion of the program. From these two
considerations two stopping rules were
developed, both of which lead to program
termination. The first rule is used after
a variable is modified but no pivot has
occurred. The reduced cost associated with
this variable should be zero but will
never actually equal zero. During the next
pass the reduced costs are tested against
the reduced cost, say $r_j$, of the variable
just modified. If no other variable
which is a candidate for modification
under section 3 of the algorithm has a
reduced cost whose absolute value
exceeds $2r_j$, then the program stops and
the present solution is printed out as
optimum. The second criterion is invoked
after a pivot. This rule stems from the
fact that the reduced cost is
$v_j \, \partial g / \partial v_j$.
We wish to stop when the objective
function is changing by extremely small
amounts relative to the objective
function. In our program, if no
candidate for modification has a
reduced cost whose absolute value
exceeds $10^{-7} g(x)$ or $10^{-5}$, whichever
is greater, then the program stops and the
present value is printed out as the
optimum. The stopping rules for the
computational results displayed in
Tables 1, 2 and 3 were purposely very
stringent. Figure 10 shows the convergence
properties of two of the longer runs.
In these situations the objective
function levels off very quickly. These
results indicate that for practical
purposes the times shown are quite high.
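The two stopping rules can be written down as a pair of small tests. The sketch below is only an illustration under stated assumptions: the function names, the dictionary layout of the reduced costs, and the use of absolute values for $r_j$ and $g(x)$ are choices made here and are not taken from the authors' program.

```python
def stop_after_search(reduced_costs, moved, candidates):
    """First rule: used after a variable was modified without a pivot.

    Stop if no other candidate has |reduced cost| above twice the
    (nominally zero) reduced cost of the variable just moved.
    """
    threshold = 2.0 * abs(reduced_costs[moved])   # assumption: |r_j| is used
    return all(abs(reduced_costs[j]) <= threshold
               for j in candidates if j != moved)

def stop_after_pivot(reduced_costs, candidates, objective_value):
    """Second rule: used after a pivot.

    Stop if no candidate has |reduced cost| above the larger of
    1e-7 * g(x) and 1e-5.
    """
    threshold = max(1e-7 * abs(objective_value), 1e-5)
    return all(abs(reduced_costs[j]) <= threshold for j in candidates)

# Hypothetical reduced costs for three variables.
costs = {"v1": 0.02, "v2": -0.03, "v3": 0.5}
first_rule_small = stop_after_search(costs, "v1", ["v1", "v2"])
first_rule_large = stop_after_search(costs, "v1", ["v1", "v2", "v3"])
second_rule = stop_after_pivot({"v1": 1e-6}, ["v1"], 100.0)
```

Both tests are relative: the first scales with the residual reduced cost left by the inexact search, the second with the size of the objective, which is the point made in the paragraph above.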


CODING  EFFICIENCY.     Use  of  equation 
(7)   when  calculating  partial  derivatives 
enhances  the  program  considerably  since 
for  a  given  solution  the  bulk  of  the 
calculation  H  O  x  stays  constant  across  all 
variables. It should be pointed out that
similar efficiencies were not coded into
the search routine. In particular the
objective function is reevaluated for each
new point of the search procedure even
though only m+1 variables of the total
variables are being modified. The linear
search itself seems to be time consuming,
which suggests that improvements here are
possible. Indeed it may prove to be true
that high precision in these searches is
not necessary early in the solution
procedure.

8.        EXAMPLE  PROBLEM 

Due to exogenous economic factors an
automobile manufacturer is faced with a
large inventory of completed new automobiles.
The company sells three basic types of
cars: large, medium and small. Each size
has a different number of cars in
inventory. The manufacturer also knows
that sales of one type of car have a
significant effect on sales of the other
two types. The manufacturer wishes to
determine a short term price structure
which will allow him to maximize revenue
while turning over a significant proportion
of his surplus inventory.

Sales of each type of car are forecasted
via the following Cobb-Douglas equation:

sales of car type i:
$S_i = K_i \, p_1^{a_{i1}} p_2^{a_{i2}} p_3^{a_{i3}}, \qquad i = 1, 2, 3$

where $p_i$ are the prices for the three cars.
Revenue from total sales is of course
$\sum_i S_i p_i$. Assume we know the following
parameters which predict sales:
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.5 

.3 
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K.  = 
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1x10" 


8x10" 


3x10" 
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48 


and

Car type i    Inventory, I_i    Minimum Desired Sales

    1             45,000              25,000
    2            128,000              64,000
    3            150,000              90,000

Maximum and minimum prices as set by
management are:

Car type i    Lower bound    Upper bound

    1            none           none
    2            none           5,000
    3            1,000          3,000

Management wishes to determine a
price for each car which maximizes net
short run revenue from sales.

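The monomial model for this problem
will be:

Max. $\left( 1\,p_1^{a_{11}+1} p_2^{a_{12}} p_3^{a_{13}} + 8\,p_1^{.45} p_2^{.5} p_3^{.4} + 3\,p_1^{.5} p_2^{.3} p_3^{.5} \right) \times 10^3$

S.t.  $25 \times 10^3 \le 1 \times 10^3 \, p_1^{a_{11}} p_2^{a_{12}} p_3^{a_{13}} \le 45 \times 10^3$

      $64 \times 10^3 \le 8 \times 10^3 \, p_1^{.45} p_2^{-.5} p_3^{.4} \le 128 \times 10^3$

      $90 \times 10^3 \le 3 \times 10^3 \, p_1^{.5} p_2^{.3} p_3^{-.5} \le 150 \times 10^3$

      $0 \le p_1$

      $0 \le p_2 \le 5{,}000$

      $1{,}000 \le p_3 \le 3{,}000$

where $a_{11}, a_{12}, a_{13}$ denote the sales exponents of car type 1 from the parameter list above.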

The  algorithm  described  in  this  paper 
took  1.8  seconds  to  solve  the  above  model. 
Prices  and  predicted  sales  are  shown  below. 


Car type i    Optimum price, p_i    Sales for these prices

    1             $9030.75                45,000
    2              5000.00               128,000
    3              1526.94                93,921

Net revenue generated from this short term
pricing strategy will be $1.190 x 10^9.
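The reported figures can be cross-checked with a few lines of arithmetic. The sketch below recomputes total revenue as the sum of sales times price, and recomputes the predicted sales of car types 2 and 3 from the Cobb-Douglas form. The exponents and coefficients used for types 2 and 3 are assumptions read from the model constraints; the parameters for type 1 are not legible in this copy, so its sales figure is taken directly from the results.

```python
# Reported optimum prices and sales for the three car types.
prices = {1: 9030.75, 2: 5000.00, 3: 1526.94}
sales = {1: 45_000, 2: 128_000, 3: 93_921}

# Net revenue is the sum over car types of sales times price.
revenue = sum(sales[i] * prices[i] for i in prices)

def cobb_douglas(k, exponents, p):
    """Predicted sales K * p1^a1 * p2^a2 * p3^a3."""
    value = k
    for i, a in enumerate(exponents, start=1):
        value *= p[i] ** a
    return value

# Assumed parameters for types 2 and 3, as read from the constraints.
s2 = cobb_douglas(8e3, (0.45, -0.5, 0.4), prices)
s3 = cobb_douglas(3e3, (0.5, 0.3, -0.5), prices)

print(f"revenue = {revenue:.4g}")
print(f"S2 = {s2:.0f}, S3 = {s3:.0f}")
```

Under these assumptions the type 2 constraint is active at its upper limit (the full inventory is sold), while type 3 sales lie strictly between the minimum desired sales and the inventory.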

9.  DISCUSSION

In this paper we have presented an
efficient algorithm for the solution of
monomial problems. Certain comments
should be made here concerning the
algorithm presented. (1) Although the
program of the algorithm was written for
polynomial objective functions, this is not
necessary. Any differentiable objective
could be easily substituted for the one
chosen in this paper. The one used,
however, is totally general within the
framework of polynomial functions.

(2) The program was written to solve
both constrained and unconstrained
variable problems. It could easily be
modified for upper bounded variables
without the necessity of using them in the
constraints. (3) As was stated earlier in
the paper, all the typical linear
programming techniques can be easily
modified to fit within the framework of
monomial programs. (4) The algorithm itself
is very similar to Zangwill's convex simplex
method [9] and Wolfe's reduced gradient
method [7] and [8]. However, in our
algorithm no transformations are required.

(5) Computational results from this
algorithm are very encouraging,
particularly in the light of possible
computational improvements which can be
made within the linear search routine.
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ABSTRACT 


Several mathematical programming formulations of
nonlinear cost multicommodity flow problems as well
as a few of their applications and solution
algorithms are considered. Three algorithms are
implemented and tested on moderate size problems: an
adaptation of Frank and Wolfe's method, Dafermos'
explicit path flow exchange procedure, and a
specialized convex simplex algorithm. Experimental
results seem to indicate that Frank and Wolfe's
method is sufficiently competitive when somewhat
less precise solutions are accepted. The specialized
convex simplex algorithm is particularly suited to
"post-optimization" tasks and is computationally
superior to the explicit path flow exchange procedure.

1.    THE  PROBLEM  AND  ITS  APPLICATIONS 

We consider the following nonlinear cost
multicommodity flow problem. Given a network with n
nodes, let {1,2,...,p}, p ≤ n, denote the set of
nodes which are either supply points (sources,
origins) or demand points (sinks, destinations), or
both. In addition, every node in the network can
be used as a transhipment point. Let A be the set
of arcs in the network and D denote a p x p matrix
whose (i, j) entry indicates the nonnegative number
of units which must flow between nodes i and j.
Let the cost of shipping $x_{ij}$ units along (i, j) be
$f_{ij}(x_{ij})$, where the cost is a function of the total
flow along the arc. Let $x^s_{ij}$ denote the flow
along arc (i,j) with origin s. The multicommodity
flow problem which we address is then

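Minimize  $f(X) = \sum_{(i,j) \in A} f_{ij}(x_{ij}) = \sum_{(i,j) \in A} f_{ij}\Bigl(\sum_{s=1}^{p} x^{s}_{ij}\Bigr)$    (1)

subject to

$D(s,j) + \sum_{k=1,\,(j,k) \in A}^{n} x^{s}_{jk} = \sum_{i=1,\,(i,j) \in A}^{n} x^{s}_{ij}$    (2)

for all j = 1, 2, ..., n,  s = 1, 2, ..., p,  j ≠ s

$x^{s}_{ij} \ge 0$ for all $(i,j) \in A$, s = 1, 2, ..., p    (3)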

We remark that (1)-(3) represent what we call the many
(destinations) to one (origin) node-arc formulation
of the multicommodity flow problem. A more intuitive
form of the flow conservation equations (2) can
be obtained by considering separately the flow $x^{st}_{ij}$
along arc (i,j) associated with origin s and
destination t. We then have the one (destination)
to one (origin) node-arc formulation of the
multicommodity flow problem.


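Minimize  $f(X) = \sum_{(i,j) \in A} f_{ij}(x_{ij}) = \sum_{(i,j) \in A} f_{ij}\Bigl(\sum_{s,t} x^{st}_{ij}\Bigr)$    (4)

subject to

$\sum_{i:\,(i,j) \in A} x^{st}_{ij} - \sum_{k:\,(j,k) \in A} x^{st}_{jk} =
\begin{cases} -D(s,t), & j = s, \\ 0, & j \ne s, t, \\ D(s,t), & j = t, \end{cases}$    (5)

for j = 1, 2, ..., n,  s,t = 1, ..., p,  t ≠ s

$x^{st}_{ij} \ge 0$ for all $(i,j) \in A$, t, s = 1, 2, ..., p    (6)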

However, the most intuitive formulation of the
problem considered is perhaps the arc-chain formulation.
This formulation is derived by providing, for each
(s,t) commodity, a set $P_{st}$ of paths from node s to
node t and path flows $x^{st}_q$ for all $q \in P_{st}$.

Minimize  $f(X) = \sum_{(i,j) \in A} f_{ij}\Bigl(\sum_{s,t,q} a^{stq}_{ij}\, x^{st}_q\Bigr)$    (7)

subject to

$\sum_{q \in P_{st}} x^{st}_q = D(s,t)$  all s,t = 1,2,...,p, s ≠ t    (8)

$x^{st}_q \ge 0$  all $q \in P_{st}$, s,t = 1,2,...,p, s ≠ t    (9)

where

$a^{stq}_{ij} = \begin{cases} 1, & \text{if arc } (i,j) \text{ belongs to path } q \in P_{st}, \\ 0, & \text{otherwise.} \end{cases}$    (10)
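The link between path flows and arc flows in the arc-chain formulation can be made concrete with a short sketch. The data layout (a dictionary mapping each O-D pair to a list of node sequences with their flows) and the function names are assumptions chosen for illustration; the incidence indicator of (10) appears implicitly as "arc lies on the path".

```python
def arc_flows_from_path_flows(path_flows):
    """Total flow on each arc: sum of the flows of all paths using it."""
    arc_flow = {}
    for paths in path_flows.values():
        for path, flow in paths:
            # Consecutive node pairs of the path are exactly its arcs.
            for arc in zip(path, path[1:]):
                arc_flow[arc] = arc_flow.get(arc, 0.0) + flow
    return arc_flow

def demand_is_met(path_flows, demand, tol=1e-9):
    """Check constraint (8): path flows of each O-D pair sum to D(s,t)."""
    return all(
        abs(sum(flow for _, flow in path_flows.get(od, [])) - d) <= tol
        for od, d in demand.items()
    )

# Hypothetical three-node example with two O-D pairs.
path_flows = {
    (1, 3): [([1, 2, 3], 2.0), ([1, 3], 1.0)],
    (2, 3): [([2, 3], 4.0)],
}
demand = {(1, 3): 3.0, (2, 3): 4.0}
arc_flow = arc_flows_from_path_flows(path_flows)
```

In this example arc (2,3) carries flow from both commodities, which is why the arc cost in (7) depends on the total over s, t and q.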
The nonlinear multicommodity network flow
formulations presented above have been used as
mathematical programming models for interesting and
practical problems in transportation networks, in
packet switched computer communications networks,
etc. We call the reader's attention to two specific
applications: the equilibrium traffic assignment
problem in a road network [1,2,3,17,19,20] and the
optimal static routing problem in a computer
communications network [8,13]. A more general
multicommodity flow problem based on the arc-chain
formulation mentioned above and including continuous
arc capacity augmentation can be found in Reference
[14], where arc capacities are explicitly accounted
for and the total cost is not necessarily a function,
separable or not, of only total flows on each arc.


2.    THE SOLUTION ALGORITHMS

It is possible to present various algorithms for
solving the nonlinear cost multicommodity flow
problem considered within the general framework of
either Zoutendijk's feasible direction method [19]
or the Conditional Gradient Method [14]. Since our
principal interest is testing specialized algorithms,
we choose to restrict ourselves to the first
alternative. Then, the feasible direction method for
solving the problem of minimizing a continuously
differentiable function f from $R^n$ into R over a
convex polyhedral set $F = \{X \in R^n : GX = 0,\ X \ge 0\}$
can be stated as follows.

Algorithm A-0

(1)  Find an initial feasible solution $X^0 \in F$. Set
k = 0.

(2)  If $X^k$ does not satisfy the stopping criteria,
then go to (3); otherwise, stop.

(3)  Find a descent direction $d^k$ with respect to
$X^k$, i.e. a vector $d^k \in R^n$ such that

$(X^k + t d^k) \in F$, for all $t \in [0, \alpha]$,

$f(X^k + t d^k) < f(X^k)$, for all $t \in (0, \gamma)$.

(4)  Find a solution $t^k$ to the one-dimensional search
problem

Minimize $\{ f(X^k + t d^k) \mid t \in [0, \alpha] \}$

(5)  Set $X^{k+1} = X^k + t^k d^k$, k = k + 1, and go to (2).

A close examination of the algorithm A-0 reveals
that the determination of descent directions plays
an important role in characterizing different
specialized versions of the feasible direction
method as applied to various problems. A simple way
to obtain such a direction, which consists in
linearizing the objective function, was suggested by
Frank and Wolfe [7]. In fact, let $y^k$ be an optimal
solution of the following linear programming problem

Minimize $\{ f(X^k) + \nabla f(X^k)^T (y - X^k) \mid y \in F \}$    (11)

then,

$d^k = y^k - X^k$

defines a descent direction with respect to $X^k$,
provided that $X^k$ is not an optimal point.

The first specialized version of the feasible
direction method for solving the nonlinear cost
multicommodity flow problem can be obtained by
observing that in this particular case the L.P.
problem (11) is equivalent to computing shortest
path flow with respect to the distances

$c^k_{ij} = \left. \dfrac{df_{ij}}{dx_{ij}} \right|_{x_{ij} = x^k_{ij}}$  for all $(i,j) \in A$.

The specialized algorithm obtained [2,8,17] can be
stated as follows.

Algorithm A-1

(1)  Let the current feasible solution be
$X^k = (x^k_{ij} \mid (i,j) \in A)$.

(2)  Compute $c^k_{ij} = \left. \dfrac{df_{ij}}{dx_{ij}} \right|_{x_{ij} = x^k_{ij}}$ and
set $y^k_{ij} = 0$ for all $(i,j) \in A$. Set s = 1.

(3)  Find the shortest path between node s and
every node in the network using the $c^k_{ij}$'s as
distances.

(4)  For each destination t in the network, set
$y^k_{ij} = y^k_{ij} + D(s,t)$ for every arc (i,j) on the
shortest path between node s and node t.

(5)  If s is not equal to the number of origins,
set s = s + 1 and go to step 3; otherwise go
to step 6.

(6)  Let $y^k = (y^k_{ij} \mid (i,j) \in A)$. Minimize the function
f along the line segment between $X^k$ and $y^k$,
using a one-dimensional search technique.
Let the minimizing point be $X^{k+1}$.

(7)  Test the stopping criterion, and go to step 2
if it fails; otherwise, stop.

Another way of obtaining a descent direction at
a feasible solution X is described below. It is
more involved than Frank and Wolfe's procedure,
and very similar to the simplex algorithm for L.P.
problems.

Let Q be a basis of the constraint coefficient
matrix G. By partitioning the variables into
basic and non-basic sets, indexed by I and J
respectively, and renumbering them if necessary, one
can choose as a feasible direction any direction

$d = (d_I, d_J)$

satisfying the following conditions:

$d_I = -[Q^{-1} R]\, d_J$

$d_j \ge 0$ for all $j \in J$ such that $x_j = 0$
                                                        (12)
$d_i \ge 0$ for all $i \in I$ such that $x_i = 0$

and

$[\nabla f(X)_J - (Q^{-1} R)^T \nabla f(X)_I]^T d_J < 0$    (13)


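Choose

$r_I = 0$

$r_J = \nabla f(X)_J - (Q^{-1} R)^T \nabla f(X)_I$

$r_{i_1} = \min_i \, [r_i]$

$r_{i_2} x_{i_2} = \max_i \, \{ r_i x_i \}$

one can obtain either a descent direction of
reduced gradient type

$d_j = \begin{cases} -r_j, & \text{if } r_j < 0, \\ -r_j x_j, & \text{otherwise;} \end{cases}$  for all $j \in J$
                                                        (14)
$d_I = -(Q^{-1} R)\, d_J$

or a descent direction of convex simplex type

$d^1_j = \begin{cases} 1, & \text{if } j = i_1, \\ 0, & \text{if } j \in J,\ j \ne i_1 \end{cases}$

$d^2_j = \begin{cases} -1, & \text{if } j = i_2, \\ 0, & \text{if } j \in J,\ j \ne i_2 \end{cases}$

$d_J = \begin{cases} d^1, & \text{if } |r_{i_1}| > r_{i_2} x_{i_2}, \\ d^2, & \text{otherwise} \end{cases}$    (15)

$d_I = -(Q^{-1} R)\, d_J$

under the following assumption:

(a)  $x_I > 0$ (the basis Q is non-degenerate)
or

(b)  for every $i \in I$, $\dfrac{\partial f}{\partial x_i} < \dfrac{\partial f}{\partial x_j}$ for all $j \in J$.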
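A small sketch may help to compare the two choices of non-basic direction in (14) and (15). It assumes the reduced gradient of the non-basic variables and their current values are already available as plain lists, and that the matrix $Q^{-1}R$ is given as a list of rows; the function names and the explicit optimality guard in the convex simplex choice are assumptions added for this illustration.

```python
def reduced_gradient_direction(r, x):
    """Non-basic part of (14): -r_j if r_j < 0, else -r_j * x_j."""
    return [-rj if rj < 0 else -rj * xj for rj, xj in zip(r, x)]

def convex_simplex_direction(r, x):
    """Non-basic part of (15): move exactly one non-basic variable.

    Returns None when no improving move exists (added guard).
    """
    n = len(r)
    i1 = min(range(n), key=lambda j: r[j])          # most negative r_j
    i2 = max(range(n), key=lambda j: r[j] * x[j])   # largest r_j * x_j
    if r[i1] >= 0 and r[i2] * x[i2] <= 0:
        return None
    d = [0.0] * n
    if r[i1] < 0 and abs(r[i1]) > r[i2] * x[i2]:
        d[i1] = 1.0      # increase the variable with most negative r_j
    else:
        d[i2] = -1.0     # decrease the variable with largest r_j * x_j
    return d

def basic_direction(q_inv_r, d_nonbasic):
    """Basic part: d_I = -(Q^{-1} R) d_J, keeping the equalities satisfied."""
    return [-sum(a * dj for a, dj in zip(row, d_nonbasic)) for row in q_inv_r]

# Hypothetical data: three non-basic variables, one basic variable.
r = [-3.0, 2.0, 0.5]
x = [0.0, 4.0, 1.0]
d_rg = reduced_gradient_direction(r, x)
d_cs = convex_simplex_direction(r, x)
d_basic = basic_direction([[1.0, 0.0, 2.0]], [1.0, 0.0, 0.0])
```

The reduced gradient direction moves every non-basic variable at once, whereas the convex simplex direction changes a single one, which is why the text treats the latter as a special case.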

In the following algorithm we will restrict
ourselves to the reduced gradient descent direction
and the arc-chain formulation, since the convex
simplex descent direction is a special case of the
reduced gradient descent direction. This algorithm
solves a series of restricted problems defined over
a series of restricted sets of allowed paths. It
uses a sequential optimization approach obtained by
decomposing a restricted problem into a series of
subproblems, each corresponding to an
origin-destination (O-D) pair. Successive restricted
problems are obtained by the path (column) generation
technique described below [16,19].

Algorithm A-2

(1)    Let be given for each O-D pair (s,t) associated
with a positive D(s,t) a restricted set
of allowed paths $P^0_{st}$ from node s to node t,
and an arbitrary path flow $\{ h^{st}_q \mid q \in P^0_{st} \}$.
Set $\ell = 0$.

(2)    Solve the restricted multicommodity flow
problem over the restricted set of allowed paths
$P^{\ell} = \bigcup_{(s,t)} P^{\ell}_{st}$ as follows.

(2.1)  Set $i_T = -1$, $i_{opt} = 0$.

(2.2)  If $i_{opt}$ is equal to the total number of
O-D pairs M, go to (3). Otherwise,
set $i_T = i_T + 1$, $k = i_T$ modulo (M) + 1,
$\ell_1 = 1$.

(2.3)  Solve the subproblem associated with the
k-th O-D pair (s,t) as follows.

(2.3.a)  Compute

$c_{ij} = \dfrac{df_{ij}(x_{ij})}{dx_{ij}}$  for all $(i,j) \in A$,

$u_q = \sum_{(i,j) \in A} a^{stq}_{ij}\, c_{ij}$  for all $q \in P^{\ell}_{st}$,

$u_b = \min_{q \in P^{\ell}_{st}} u_q$.


(2,3.b)  Compute 


(u    -  u  )  h^*  ,  if  h  >€  and 


Iv 


d  =\ 
q 


b        q'  q 
0,  otherwise. 

E 


q  1 


>  e 


2, 


^^^s,t 


(2.3.c)  If $\|d\| > 0$, then go to (2.3.d). Otherwise,
the flows on paths in $P^{\ell}_{st}$ satisfy
the optimality conditions. Then, if
$\ell_1 \le 1$ let $i_{opt} = i_{opt} + 1$ and go to
(2.2); otherwise, i.e. $\ell_1 \ge 2$, let
$i_{opt} = 1$ and go to (2.2).


(2.3.d)  Search for $t^*$ minimizing

$\sum_{(i,j) \in A} f_{ij}(x_{ij} + t\, w_{ij})$

where

$w_{ij} = \sum_{q \in P^{\ell}_{st}} a^{stq}_{ij}\, d_q$

and

$0 \le t \le \min \{ -h^{st}_q / d_q \mid q \in P^{\ell}_{st},\ d_q < 0 \}$.

(2.3.e)  Let $h^{st}_q = h^{st}_q + t^* d_q$, all $q \in P^{\ell}_{st}$,

$x_{ij} = x_{ij} + t^* w_{ij}$, all $(i,j) \in A$,

$\ell_1 = \ell_1 + 1$, and go to (2.3.a).

(3)    For every O-D pair (s,t), find a shortest
path $b_{st}$ from s to t with respect to the
distances

$c_{ij} = \dfrac{df_{ij}(x_{ij})}{dx_{ij}}$  for all $(i,j) \in A$.

Let $u_{st}$ be the length of $b_{st}$, and $z_{st}$ be the
length for all significantly positive flow
paths in $P^{\ell}_{st}$.

Set

$P^{\ell+1}_{st} = \begin{cases} P^{\ell}_{st} \cup \{ b_{st} \}, & \text{if } \dfrac{z_{st} - u_{st}}{z_{st}} > \epsilon_2, \\ P^{\ell}_{st}, & \text{otherwise,} \end{cases}$

$P^{\ell+1} = \bigcup_{s,t} P^{\ell+1}_{st}$.

If $P^{\ell+1} \subset P^{\ell}$ then stop; otherwise, set $\ell = \ell + 1$
and go to (2).

Close examination of Algorithm A-2 reveals that
it would be a cumbersome, if not clumsy,
bookkeeping task to store all positive flow paths, to
check and to update all path flows, separately.
Storage requirements could be fantastically large,
even for mass storage devices, if the networks
considered are of considerable size. Fortunately,
there exists an equivalent, but very compact
representation of path flows based on the node-arc
formulation of multicommodity flow problems.
Moreover, time-consuming shortest path calculations can
be avoided completely by using a generalized version
of the augmented predecessor index method [9] to solve
the flow subproblem associated with an origin node
k, while maintaining all other flows constant. These
considerations have given rise to the following
algorithm [20].

Algorithm A-3

(1)  For every origin node k, find an arbitrary flow
$(x^k_{ij},\ (i,j) \in A)$ and a spanning arborescence
E(k) rooted at k. Let $i_T = -1$ and $i_{opt} = 0$.

(2)  If $i_{opt}$ is equal to the number of origins $N_S$,
then an optimal solution is attained. Otherwise,
let $i_T = i_T + 1$ and $k = i_T$ modulo $(N_S)$ + 1.

(3)  Solve the flow problem associated with the
origin node k as follows.

(a)  Let $\ell = 0$ and compute the distances

$c_{ij} = \dfrac{df_{ij}(x_{ij})}{dx_{ij}}$  for all $(i,j) \in A$.

(b)  Compute for each node i ≠ k the length of
the path $J_{ki}$ from k to i in E(k): $z_i = \sum_{(u,v) \in J_{ki}} c_{uv}$.
Compute for each arc $(i,j) \notin E(k)$ the reduced length

$r_{ij} = c_{ij} + z_i - z_j$

(c)  Define

$I_1 = \{ (i,j) \notin E(k) \mid r_{ij} < -\epsilon_1 \}$

$I_2 = \{ (i,j) \notin E(k) \mid r_{ij} > \epsilon_2 \text{ and } x^k_{ij} > 0 \}$

If either $I_1$ or $I_2$ is nonempty go to (d).
Otherwise, the flows $x^k_{ij}$ originating at k satisfy
the optimality conditions. Thus, if $\ell \le 0$ then
let $i_{opt} = i_{opt} + 1$ and go to (2); otherwise,
let $i_{opt} = 1$ and go to (2).

(d)  Determine $(i_1, j_1)$ and $(i_2, j_2)$ such that

$r_{i_1 j_1} = \min_{I_1} [r_{uv}]$  and  $r_{i_2 j_2}\, x^k_{i_2 j_2} = \max_{I_2} [r_{uv}\, x^k_{uv}]$.

If $|r_{i_1 j_1}| \ge r_{i_2 j_2}\, x^k_{i_2 j_2}$, then let $(\bar{i}, \bar{j}) = (i_1, j_1)$
and $\sigma = +1$; otherwise let $(\bar{i}, \bar{j}) = (i_2, j_2)$ and
$\sigma = -1$.

Retrace the cycle

$L_{\bar{i}\bar{j}} = J_{k\bar{i}} \cup \{ (\bar{i}, \bar{j}) \} \cup J_{k\bar{j}} = L^+_{\bar{i}\bar{j}} \cup L^-_{\bar{i}\bar{j}}$

Let

$d_{uv} = \begin{cases} 0, & (u,v) \notin L_{\bar{i}\bar{j}}, \\ \sigma, & (u,v) \in L^+_{\bar{i}\bar{j}}, \\ -\sigma, & (u,v) \in L^-_{\bar{i}\bar{j}}, \end{cases}$  for all $(u,v) \in A$.

(e)  Determine

$\alpha = \begin{cases} \min \{ x^k_{uv} \mid (u,v) \in L^-_{\bar{i}\bar{j}} \}, & \text{if } \sigma = 1, \\ \min \{ x^k_{uv} \mid (u,v) \in L^+_{\bar{i}\bar{j}} \}, & \text{if } \sigma = -1. \end{cases}$

If $\alpha > 0$ go to (g); otherwise, perform the
following pivoting operation.

(f)  Let $E(k) = E(k) \cup \{ (u,v) \} - \{ (i_3, j_3) \}$ where
(u,v) is an arbitrary element of the set

$I_3 = \{ (u,v) \notin E(k) \mid v = j_3 \text{ and } x^k_{uv} > 0 \}$

(if $I_3$ is empty replace $(i_3, j_3)$ by its successor on
$L^-_{\bar{i}\bar{j}}$ or $L^+_{\bar{i}\bar{j}}$ and determine again a new $I_3$, and
so on) and go to (b).

(g)  Find $t^*$ minimizing

$\sum_{(i,j) \in L_{\bar{i}\bar{j}}} f_{ij}(x_{ij} + t\, d_{ij})$  for $t \in [0, \alpha]$

(h)  Let

$x^k_{uv} = x^k_{uv} + t^* d_{uv}$,

$x_{uv} = x_{uv} + t^* d_{uv}$,

$c_{uv} = f'_{uv}(x_{uv})$,  for all $(u,v) \in L_{\bar{i}\bar{j}}$.

Set $\ell = \ell + 1$ and go to (b).

3.     ALGORITHM IMPLEMENTATION, TESTING AND USE

In this section we discuss the implementation
of a basic version for each of the algorithms A1,
A2, and A3 presented in Section 2. Basic versions
of the algorithms were programmed and tested on
relatively moderate size problems. The tests were
planned to gain insight into the computational
behavior of the algorithms.

(3.a)    For test problems on networks with about a
hundred nodes and a thousand arcs, the
implementation of algorithm A1 does not present any
serious problem. At iteration k of the algorithm it is
necessary to find the shortest paths for all O-D
pairs with respect to the distances equal to the
marginal costs of traveling on arcs at current
flows $X^k$. Shortest path flows $y^k$ are obtained by
sending for every O-D pair (s,t) the flow
requirement D(s,t) along the shortest path joining s
and t. Floyd's algorithm [5] has been used to
determine the shortest paths between all pairs of
nodes of a network. This choice has been made in
order to facilitate programming tasks. Computing
time could be significantly reduced by repeatedly
using Dijkstra's algorithm [4], especially when the
number of origins $N_S$ is much smaller than the
total number of nodes. The Golden Section method
[15] has been used for searching for a minimum of the
function f along the line segment $[X^k, y^k]$. The
algorithm is started from the initial shortest
path flows with respect to any estimated distances.

Two test networks are considered: one is the 76
arc, 24 node network of the City of Sioux Falls,
South Dakota [17], and the other is the 376 arc,
155 node network of the City of Hull, Canada [19].
Global computational results are presented in Table
1 for comparison with the algorithms A2 and A3.
We remark that for the basic version implemented,
computing time per iteration of Algorithm A1 is
independent of the number of O-D pairs.
Consequently, for the Sioux Falls network, we solved
only one test problem with 528 O-D pairs. Moreover,
to reveal the convergence speed of algorithm A1, we
have plotted in Figure 1 the value of the objective
function, maximum percentage change in flow
components, and computing time as functions of the
number of iterations.

A convergence difficulty inherent to Frank and
Wolfe's algorithm occurs when the objective
function assumes a minimum point in the interior of the
feasible region. In such a case, the direction
indicated by the gradient may change very rapidly
as the sequence of points approaches the optimal
solution. Consequently, the points may zigzag
around the optimal solution. In our test problems,
although the gradient of the objective function
cannot be zero at the optimal solution (since the
objective function is strictly increasing and has
no stationary point in the feasible region), the
convergence seems to slow down appreciably when
approaching the optimal solution.
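The steps of Algorithm A1 as implemented above can be sketched compactly. The following is an illustration under stated assumptions rather than the program that was tested: it uses repeated Dijkstra searches instead of Floyd's algorithm, replaces the Golden Section method by bisection on the directional derivative (valid for convex arc costs), starts from shortest paths at zero flow, and all function and variable names are hypothetical.

```python
import heapq

def shortest_path_predecessors(nodes, arcs, dist, source):
    """Dijkstra's algorithm; returns the predecessor of each reached node."""
    best = {n: float("inf") for n in nodes}
    best[source] = 0.0
    pred = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best[u]:
            continue                     # stale heap entry
        for (i, j) in arcs:
            if i == u and d + dist[(i, j)] < best[j]:
                best[j] = d + dist[(i, j)]
                pred[j] = i
                heapq.heappush(heap, (best[j], j))
    return pred

def all_or_nothing(nodes, arcs, dist, demand):
    """Send each D(s,t) along the shortest path from s to t."""
    y = {a: 0.0 for a in arcs}
    for s in {s for (s, _) in demand}:
        pred = shortest_path_predecessors(nodes, arcs, dist, s)
        for (origin, t), d in demand.items():
            if origin != s:
                continue
            j = t
            while j != s:                # walk back along the tree
                i = pred[j]
                y[(i, j)] += d
                j = i
    return y

def frank_wolfe(nodes, marginal, demand, iterations=20):
    """marginal[(i, j)] is the derivative of the arc cost f_ij."""
    arcs = list(marginal)
    x = all_or_nothing(nodes, arcs, {a: marginal[a](0.0) for a in arcs}, demand)
    for _ in range(iterations):
        c = {a: marginal[a](x[a]) for a in arcs}
        y = all_or_nothing(nodes, arcs, c, demand)

        def slope(t):
            # Derivative of f along the segment from x to y.
            return sum(marginal[a](x[a] + t * (y[a] - x[a])) * (y[a] - x[a])
                       for a in arcs)

        if slope(0.0) >= 0.0:
            break                        # no descent along this segment
        lo, hi = 0.0, 1.0
        if slope(1.0) <= 0.0:
            lo = hi = 1.0
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if slope(mid) > 0.0:
                hi = mid
            else:
                lo = mid
        t = 0.5 * (lo + hi)
        x = {a: x[a] + t * (y[a] - x[a]) for a in arcs}
    return x

# Hypothetical network: a direct arc 1->2 and a detour through node 3.
marginal = {
    (1, 2): lambda v: 2.0 * v,
    (1, 3): lambda v: 1.0 + v,
    (3, 2): lambda v: 1.0 + v,
}
flows = frank_wolfe([1, 2, 3], marginal, {(1, 2): 4.0})
```

In this small example the two routes end with equal marginal cost, with 2.5 units on the direct arc and 1.5 units on the detour. Each iteration costs one shortest path tree per origin, which matches the remark that the time per iteration does not depend on the number of O-D pairs.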


(3.b)    The basic version of algorithm A2 is
essentially the computer program given in [16],
implementing Dafermos' algorithm using a column
generation technique. In this variant of algorithm A2
the descent direction in (2.3.b) is actually
replaced by the following:

$u_a = \min \, ( u_q \mid q \in P^{\ell}_{st} )$

$u_b = \max \, ( u_q \mid q \in P^{\ell}_{st},\ h^{st}_q > 0 )$

$d_a = 1$

$d_b = -1$

$d_q = 0$ for all $q \in P^{\ell}_{st}$, q ≠ a, q ≠ b

An appropriate move along d means shifting flow
from the maximum cost path to the minimum cost
path in order to equalize travel costs on these
paths.

The program has been modified in order to
accommodate many more O-D pairs than the maximum number
of 20 originally allowed. These minor
modifications are straightforward and consist in reducing
the core storage required to memorize path and
path flow data. This reduction is easily achieved
by performing flow adjustments for one O-D pair at
a time. Only data associated with the O-D pair
considered need to be in core. When all path flows
for this O-D pair satisfy the optimality conditions,
updated data are written on disks for later use.

(3.c) The essential part of algorithm A3 consists in solving a one-commodity flow problem associated with an origin node. This is achieved by a generalized version of the augmented predecessor index method. For strongly connected networks, it is also possible to work only with a basis which can be represented by a spanning arborescence E(k) rooted at origin node k. For the arborescence representation, we use triple-label notation [12] in which each node i receives three labels

- child (i) pointing to the first node j such that (i,j) ∈ E(k),

- parent (i) pointing to the node j for which (j,i) ∈ E(k), and

- sibling (i) pointing to a node j ≠ i such that (parent (i), j) ∈ E(k).

Node numbers z_i are then readily obtained by traversing arborescence E(k) in end-order. Reduced arc lengths computed from node numbers and marginal costs permit us to verify the optimality conditions, and to choose an out-of-kilter arc for which some flow adjustment is performed.
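
The triple-label representation and the end-order traversal can be sketched as follows. This is only an illustration under stated assumptions: nodes are numbered 0..n-1, the arborescence is given as a predecessor list, and the sibling label is taken to mean "next sibling", so that all children of a node form a chain starting at its child label. The function names are hypothetical and do not come from [12].

```python
def triple_labels(parent):
    """Build child and sibling labels from predecessor labels.

    parent[i] is the predecessor of node i, or None for the root.
    child[i] is the first child of i; sibling[i] is the next child
    of parent[i] after i (None when there is none).
    """
    n = len(parent)
    child = [None] * n
    sibling = [None] * n
    # Scanning backwards keeps each child chain in increasing node order.
    for i in range(n - 1, -1, -1):
        p = parent[i]
        if p is not None:
            sibling[i] = child[p]
            child[p] = i
    return child, sibling


def end_order(root, child, sibling):
    """Return the nodes in end-order: every node after all its descendants."""
    order = []
    stack = [(root, False)]
    while stack:
        node, expanded = stack.pop()
        if expanded:
            order.append(node)
            continue
        stack.append((node, True))
        kids = []
        k = child[node]
        while k is not None:
            kids.append(k)
            k = sibling[k]
        for k in reversed(kids):
            stack.append((k, False))
    return order


# Small arborescence rooted at node 0: arcs (0,1), (0,2), (1,3), (1,4).
parent = [None, 0, 0, 1, 1]
child, sibling = triple_labels(parent)
order = end_order(0, child, sibling)
```

An explicit stack is used instead of recursion so that deep arborescences do not hit the interpreter's recursion limit.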

Let (i, j) be the out-of-kilter arc considered, and let us suppose that its reduced length is negative. Thus, flow adjustment consists in transferring flow from (q, r, s, j) to (q, t, i) to bring this reduced length to zero. The same adjustment process as that used in [16], i.e. an interval halving method, is adopted, since in this process no iterations are required if integral (lump) flow transfer occurs. This speeds up the computation, especially at the beginning of the solution process, and gives better results than the golden section method.
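
A minimal sketch of an interval halving adjustment is given below. It assumes that the cost gap between the donor path and the receiving path is a nonincreasing function of the amount transferred; the function name `halving_transfer` and the linear cost gaps in the example are hypothetical. The early return corresponds to the lump transfer case, in which all available flow is moved without any iteration.

```python
def halving_transfer(gap, max_shift, tol=1e-9):
    """Find how much flow to move from a donor path to a receiving path.

    gap(delta) is the donor cost minus the receiving cost after moving
    delta units; it is assumed to be nonincreasing in delta.
    """
    if gap(max_shift) >= 0.0:
        # Lump transfer: the donor is still not cheaper after moving
        # everything, so the whole flow is shifted without iterating.
        return max_shift
    lo, hi = 0.0, max_shift
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical linear costs: donor 11 - 2*delta, receiver 3 + delta.
partial = halving_transfer(lambda delta: 8.0 - 3.0 * delta, 5.0)
lump = halving_transfer(lambda delta: 20.0 - delta, 5.0)
```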


Basis  changes  are  frequently  required.    To  per- 
form a  basis  change,  a  zero  flow  arc  (r,s)  in  E(k) 
is  replaced  by  a  positive  flow  arc  (u,s).    Such  an 
arc  (u,s)  exists  necessarily,  if  flow  on  (s,j)  is 
positive.    Otherwise,  (s,j)  is  considered  instead 
of  (r,s),  and  can  in  fact  be  replaced  by  (i,j). 
After  a  basis  change,  if  arc  (u,v)  is  inserted  into 
the  new  arborescence,  then  only  sub-arborescence 
rooted  at  u  needs  to  be  considered  for  purpose  of 
updating  node  numbers. 

To start the algorithm, initial arborescences rooted at different origin nodes as well as arbitrary initial flows are required. We calculate the shortest path from an origin to every node in the network by means of Dijkstra's algorithm. At the end of the shortest path calculation we obtain the shortest path flows, and for each node the shortest distance as well as its predecessor node on the shortest path from the origin to the node itself.

This means that we have for each origin an arborescence represented in predecessor index notation, which is easily transformed into triple-label notation and used for basis representation. The shortest path flows obtained are used as initial flows.
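
The initialization step can be illustrated with a short sketch of Dijkstra's algorithm that returns both the shortest distances and the predecessor indices defining the initial arborescence. The adjacency-list layout and the function name are assumptions made for this example, and arc lengths are assumed nonnegative.

```python
import heapq


def dijkstra(adj, origin):
    """Shortest distances and predecessor labels from one origin.

    adj maps a node to a list of (successor, arc_length) pairs with
    nonnegative lengths. The predecessor labels describe a shortest
    path arborescence rooted at the origin.
    """
    dist = {origin: 0.0}
    pred = {origin: None}
    done = set()
    heap = [(0.0, origin)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, length in adj.get(u, ()):
            nd = d + length
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                pred[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, pred


# Hypothetical four-node network.
adj = {0: [(1, 4.0), (2, 1.0)], 2: [(1, 2.0)], 1: [(3, 5.0)]}
dist, pred = dijkstra(adj, 0)
```

Loading the demand of each destination along its predecessor chain then gives the shortest path flows used as the starting point.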

(3.d) We will now discuss experimental results obtained from solution of the test problems by means of algorithms A-1, A-2, and A-3.

First, it is clear from the statement of the algorithms and the presentation of the basic versions implemented that algorithms A-2 and A-3 are quite similar. For both algorithms we may interpret the underlying iterative computational process within the framework of traffic equilibrium as a process by which rational drivers acquire knowledge about their travel cost and switch to cheaper routes, congesting them and causing a new flow pattern to evolve. This interpretation gives rise to a realistic stopping criterion for algorithms A-2 and A-3, which requires that the most expensive and the cheapest used paths as well as the cheapest unused path from an origin to a destination must not differ by more than, say, 5% in travel cost. Such stopping criteria may also be used to generate solutions simulating the well accepted fact that a certain percentage of drivers do not care about small percentage differences in travel cost between paths from their origin to their destination.
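
The stopping criterion for one O-D pair might be checked as in the sketch below. The text does not say which cost the 5% is measured against; taking it relative to the cheapest of the paths considered is an assumption of this example, as are the function and parameter names.

```python
def od_pair_converged(used_costs, cheapest_unused_cost=None, tol=0.05):
    """Check the relative cost spread for one O-D pair.

    used_costs: travel costs of the paths carrying positive flow.
    cheapest_unused_cost: cost of the cheapest path with zero flow,
    or None if no such path is known.
    The spread is measured relative to the cheapest cost (assumption).
    """
    costs = list(used_costs)
    if cheapest_unused_cost is not None:
        costs.append(cheapest_unused_cost)
    lowest, highest = min(costs), max(costs)
    return highest - lowest <= tol * lowest


close_enough = od_pair_converged([10.0, 10.3], 10.4)
still_moving = od_pair_converged([10.0, 11.0])
```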

Second, although flow adjustments in algorithms A2 and A3 are quite similar, algorithm A3 uses a more compact representation of flows without explicitly storing all paths allowed. Thus, we may expect algorithm A3 to provide better performance than algorithm A2 in terms of core and mass storage requirements as well as computing time, since less book-keeping is necessary and the shortest path calculation is avoided. The experimental results obtained confirm this conjecture in a very striking manner. Figure 2 shows a relation between computing time and number of O-D pairs for a series of 4 Sioux Falls network test problems presented in Table 1. The computing time of algorithm A2 is seven times greater than that of A3. All runs were executed on an IBM System/360 Model 50 with Ampex ECS under OS-MVT, using the Fortran IV Level G compiler.

Third, the computational behavior of algorithm A1 is significantly different from that of algorithms A2 and A3. Defining an iteration as a shortest path flow calculation (essentially an execution of Floyd's algorithm) followed by a one-dimensional search, the value of the objective function, maximum percentage change in flow components, and computing time are shown iteration by iteration in Figure 1. The results indicate that improvement per iteration is very substantial at the beginning of the iterative process, but that the rate of convergence slows when approaching the optimal solution. This means that increasing the precision of the solutions is a time-consuming task. Such a characteristic could be annoying for people attempting to use algorithm A1 in selecting network topology or network improvement projects.

(3.e) We are in fact searching for an efficient network configuration evaluation procedure to be used in selecting network improvement projects subject to a budget constraint. For a regional road network, where congestion may reasonably be neglected, we have proved a superadditive property of a natural objective function and devised a powerful branch search procedure [10], which was later greatly improved and reported in [5]. Interesting results presented in [5] also show that heuristics, related to those suggested in [10], give solutions extremely close to the optimal ones. We are presently developing similar procedures for networks where congestion is no longer negligible [11]. In such a case it is necessary to perform post-optimality analysis by removing a link from or adding a link to a network configuration. Algorithm A3 seems particularly suited to this reoptimization task. Table 2 shows computing time required as a function of link insertions/deletions.

Table 1 - Computational results for the algorithms A-1, A-2, and A-3 (computing time in seconds, normalized final value of the objective function).

Figure 1.a - Computational indices vs iteration number (Sioux Falls Network, Algorithm A1).

Figure 1.b - Computational indices vs iteration number (Hull Network, Algorithm A1).

Figure 2 - Computing time vs number of O-D pairs.

Table 2 - Network reoptimization after link modifications (columns: number of modifications, average computing time).

ABSTRACT


In  certain  linear  programs,   for  example,  those 
arising  out  of  production-inventory  type  problems, 
a  large  number  of  constraints  have  a  special  form. 
Examples  are:     simple  upper  bounds,  generalized 
upper  bounds,  variable  upper  bounds,  and  general- 
ized variable  upper  bounds.     This  paper  identifies 
a  class  of  constraints  which  enables  us  to  obtain 
a  large  triangular  submatrix  in  any  feasible  basis. 
Also  presented  is  a  method  for  representing  these 
constraints  implicitly  within  a  modified  version 
of  the  revised  simplex  method. 

1 .  Introduction 

From the early fifties on, there has been a steady progression of optimization strategies and tactics for solving large linear programs.¹ Such large LP's are common in problems in production and operation management. Examples are those derived from mixed integer linear programs arising out of such problems as production-inventory planning and facilities location. Consideration of multi-periods and/or multi-products increases the size of these problems substantially. Another factor contributing to the increase in size in some problems is the possibility of tighter formulations of integer programs as opposed to loose formulations. The importance of such tighter formulations is well explained by Schrage (1975b).

One area of research on large LP's is that of tactical variations in pricing and pivot selection. Another area, which is the area of our interest, is the exploitation of special structures to reduce storage and computation time by avoiding explicit consideration of a full inverse. Examples of such special structure constraints which have been represented implicitly are simple upper bounds (SUB), generalized upper bounds (GUB), and more recently variable upper bounds (VUB) (cf. Schrage (1975a)), and generalized variable upper bounds (GVUB), cf. Schrage (1975b). Mention should also be made of angular structures and factorizations described in Lasdon (1970) and Graves and MacBride (1973).


This paper identifies a class of constraints appearing in certain dynamic production management problems which allow us, by judicious row and column permutation, to obtain a large lower triangular submatrix in any feasible basis of the problem. Section 2 introduces these constraints and illustrates with examples instances where such constraints occur. Section 3 discusses the importance of having a large lower triangular submatrix. Section 4 outlines the detailed development of a solution method based on a modified version of the revised simplex method. Implementation considerations currently underway are also discussed.

2 .  Triangularity  Constraints 

In this section we define triangularity constraints and also give some examples of where they arise. First define

Eligible set of row i = the variables (including slacks) that have a strictly positive coefficient on the LHS of row i; i.e.

$$\{\, x_j : a_{ij} > 0 \,\}.$$

Now we define triangularity constraints.

A set of constraints of a linear program is called triangularity constraints (or T constraints) if

(a) their RHS are nonnegative, and

(b) the set of constraints can be permuted so that in a feasible basis to the problem, a variable in the eligible set of any T constraint does not have a nonzero element in any other T constraint lying above this constraint.

¹ For a recent review of some of this research see Beale (1975) and also the preface of the same publication.

Example 1: One Machine, Multiproduct, Multiperiod Scheduling Problem with Setups.

$$\sum_{p=1}^{P} y_{ip} \le 1, \quad \text{all } i \qquad (3)$$

$$z_{ip} \ge y_{ip} - y_{i-1,p}, \quad \text{all } i, p \qquad (4)\mathrm{T}$$

(or, $y_{ip} - y_{i-1,p} - z_{ip} \le 0$)

$$x_{ijp} \le y_{ip}, \quad \text{all } i, j, p \qquad (5)\mathrm{T}$$

where the binary variable indexed by (i, p) equals 1 if a setup is made in period i, and 0 otherwise.

Note that the LP relevant to our discussion is the relaxation: Min (1), s.t. (2) - (6). The letter T against constraints (4) and (5) indicates that they are T constraints. Now consider these two sets of constraints together.

For any RHS constant which is zero, without loss of generality we can assume it to be a small positive number. Thus one of the variables with a positive coefficient in the LHS of each constraint of (4) and (5) must be in a feasible basis. For these variables of (5), there are no other nonzero elements in their columns within (4) and (5). This is no surprise as (5) is in fact of VUB form. In (4), the $y_{ip}$ columns have no elements above the positive element corresponding to the row of (4) in which a particular $y_{ip}$ is basic. Thus (4) and (5) qualify as T constraints.

The immediate advantage in recognizing the T constraints is that we are able to get any feasible basis into the form:

$$B = \begin{bmatrix} C & D \\ E & T \end{bmatrix} \qquad (8)$$

where T is composed of constraints (4) and (5), and is lower triangular. Note that in order to get this form we had to place (4) and (5) in that order at the bottom of the constraints.

Example 2: Dynamic Plant Location Problem (Roodman and Schwarz, 1974)

$$\text{Min} \quad \sum_{i,t} F_{it}\, y_{it} + \sum_{i,j,t} c_{ijt}\, x_{ijt} \qquad (9)$$

s.t.

$$\sum_{i} x_{ijt} = 1, \quad \text{all } j, t \qquad (10)$$

$$-y_{it} + y_{i,t-1} \le 0, \quad \text{all } i, t \qquad (11)\mathrm{T}$$

$$x_{ijt} \le y_{it}, \quad \text{all } i, j, t \qquad (12)\mathrm{T}$$

$$x_{ijt},\ y_{it} \ge 0 \qquad (13)$$

$$y_{it} \ \text{is integer } (0, 1) \qquad (14)$$

where

$F_{it}$ = fixed cost of operating plant i in period t

$c_{ijt}$ = cost of transporting a unit from i to j in period t

$x_{ijt}$ = fraction of requirement at j supplied by i in period t

$y_{it}$ = 1 if plant i is "open" in period t, 0 else.

The  LP  relaxation  is:     Min   (9),   s.t.    (10)    -  (13). 
Now,  following  arguments  similar  to  those  in  exam- 
ple 1,  we  can  get  any  feasible  basis  into  form  (8). 
(11)  and  (12)  become  the  triangularity  constraints. 

To  summarize,  the  motivation  behind  identifying  T 
constraints  was  the  ability  to  get  any  feasible 
basis  into  the  form  (8) .     The  T  constraints  form 
the  lower  triangular  T,  and  in  the  examples  given 
above  it  would  be  seen  that  T  is  fairly  large. 
Thus,  considerable  savings  could  be  anticipated 
if  these  T  constraints  are  implicitly  represented. 
To  do  this,  we  exploit  some  special  properties  of 
triangular  matrices. 

3 .  Usefulness  of  the  Triangular  Submatrix 

The following properties of triangular matrices make it convenient to have a large lower triangular submatrix in the basis.

(i) In our problem T is nonsingular. Thus $T^{-1}$ exists. $T^{-1}$ is also lower triangular.

(ii) $T^{-1}$ can be computed by iterative substitution. (In general, $T^{-1}$ is not explicitly needed.)

(iii) Pre-multiplication of matrices or vectors by $T^{-1}$ can be done without finding $T^{-1}$ explicitly. For instance, let R be an n x m matrix and suppose we want $F = T^{-1}R$. If we partition F and R by columns,

$$f_i = T^{-1} r_i .$$

Now if we solve the system $T f_i = r_i$ (i = 1, 2, ..., m) by back substitution, we can get $T^{-1}R$ in $mn^2/2$ multiplications. In using the revised simplex method, pre-multiplications of this type occur several times.

(iv) Both (ii) and (iii) are simplified if T has further special structures, e.g., band structures. Some practical problems tend to have such band matrices in place of T.
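
Property (iii) can be illustrated with a short sketch that solves T f = r by substitution, never forming the inverse of T. For a lower triangular T the substitution runs from the first row downward. The function names and the dense list-of-rows storage are assumptions made for this example; an implementation exploiting sparsity would look different.

```python
def solve_lower(T, r):
    """Solve T f = r for a nonsingular lower triangular matrix T.

    T is a list of rows; entries above the diagonal are ignored.
    The cost is about n*n/2 multiplications for an n x n matrix.
    """
    n = len(T)
    f = [0.0] * n
    for i in range(n):
        s = r[i]
        for k in range(i):
            s -= T[i][k] * f[k]
        f[i] = s / T[i][i]
    return f


def premultiply_by_inverse(T, R_columns):
    """Return the columns of T^{-1} R, one triangular solve per column."""
    return [solve_lower(T, col) for col in R_columns]


# Hypothetical 3 x 3 staircase matrix with unit diagonal.
T = [[1.0, 0.0, 0.0],
     [-1.0, 1.0, 0.0],
     [0.0, -1.0, 1.0]]
f = solve_lower(T, [1.0, 0.0, 0.0])
```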

4. Implicit Representation of T Constraints

We saw earlier that for problems with T constraints a feasible basis could be written in the form:

$$B = \begin{bmatrix} C & D \\ E & T \end{bmatrix}$$

With this form of B, we have

$$B^{-1} = \begin{bmatrix} (C - DT^{-1}E)^{-1} & (C - DT^{-1}E)^{-1}(-D)T^{-1} \\ T^{-1}(-E)(C - DT^{-1}E)^{-1} & T^{-1}(-E)(C - DT^{-1}E)^{-1}(-D)T^{-1} + T^{-1} \end{bmatrix} \qquad (15)$$

We plan to carry only the smaller inverse $(C - DT^{-1}E)^{-1}$ and not the larger $B^{-1}$. Information in the coefficient matrix A would be carried in part explicitly and in part using pointers and linked lists. The next step is to apply the revised simplex method to this reduced working basis. To do this, we write the entire LP in standard form:

$$\text{Max } z$$

$$\text{s.t.} \quad z + \sum_{j=1}^{N} a_{0j} x_j = 0$$

$$\sum_{j=1}^{N} a_{ij} x_j = a_{i0}, \quad i = 1, 2, \ldots, M$$

$$x_j \ge 0$$

Constraint $z + \sum_{j=1}^{N} a_{0j} x_j = 0$ will be the top row (row zero) in the tableau. If we had the full $B^{-1}$, the top row of this would have been the vector of dual prices $\{\pi_i\}$. Also, let m (< M) be the last explicit row, i.e., T constraints are in rows m + 1, ..., M.

Construction of the Pricing Vector

All elements of the pricing vector $\{\pi_i\}$ can be computed from the smaller inverse and the other information we carry. If $i \le m$, then $\pi_i$ is simply element i of row zero of $(C - DT^{-1}E)^{-1}$. For $i > m$, we can compute $\pi_i$ starting from i = M as follows. Denote by b(i) the column basic in row i. Then,

for i = M:

$$\sum_{k=0}^{m} \pi_k a_{k\,b(M)} + \pi_M a_{M\,b(M)} = 0, \quad \text{i.e.,} \quad \pi_M = -\frac{\sum_{k=0}^{m} \pi_k a_{k\,b(M)}}{a_{M\,b(M)}}$$

for i = M - 1:

$$\sum_{k=0}^{m} \pi_k a_{k\,b(M-1)} + \pi_{M-1} a_{M-1,\,b(M-1)} + \pi_M a_{M\,b(M-1)} = 0$$

In general for $i > m$:

$$\sum_{k=0}^{m} \pi_k a_{k\,b(i)} + \sum_{p=i}^{M} \pi_p a_{p\,b(i)} = 0$$
In fact, we do not always have to start computing $\pi_i$'s from the last row. Certain $\pi_i$'s, namely those corresponding to columns of T which have only the diagonal element nonzero, are also directly computed. This property is useful in implementing the pricing procedure. When we need a $\pi_i$, $i > m$, but do not have the price of a later row, we find the column basic in that row and price it out. This loop will produce the required $\pi_i$ as soon as we reach a column with only the diagonal element nonzero. In a typical implementation we would price out only a small portion of the columns at any iteration. If none of the columns priced have elements in row i, then that $\pi_i$ is not computed. The fact that the $a_{ij}$ are usually simple elements (0, 1, -1) also helps in reducing multiplications.
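
The construction of the prices for the implicit rows can be sketched as a backward pass from row M to row m + 1, using the condition that each basic column prices out to zero. The sparse column layout (a dictionary from row index to coefficient) and all names below are assumptions made for this illustration; the lazy, on-demand variant described in the text is not shown.

```python
def implicit_prices(pi_explicit, columns, basic_col, m, M):
    """Extend the explicit prices pi_0..pi_m to the implicit rows m+1..M.

    columns[j] maps row index -> coefficient of column j (rows 0..M).
    basic_col[i] names the column basic in implicit row i.
    Each basic column is assumed to respect the lower triangular form:
    within rows m+1..M it has no entry above its own row.
    """
    pi = list(pi_explicit) + [0.0] * (M - m)
    for i in range(M, m, -1):
        col = columns[basic_col[i]]
        total = 0.0
        for k, a in col.items():
            if k == i:
                continue
            if m < k < i:
                raise ValueError("basic column violates lower triangularity")
            total += pi[k] * a  # rows k <= m or k > i are already priced
        pi[i] = -total / col[i]
    return pi


# Hypothetical data: explicit rows 0..1, implicit rows 2..3.
columns = {"u": {0: 1.0, 3: 1.0}, "v": {1: 1.0, 2: 1.0, 3: -1.0}}
prices = implicit_prices([1.0, 2.0], columns, {2: "v", 3: "u"}, m=1, M=3)
```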

Selecting an Entering Column

This is done by computing $\sum_{k=0}^{M} \pi_k a_{kj}$ for selected nonbasic columns. The only saving here possibly would be by using the knowledge that some of the $a_{kj}$'s are +1 or -1.

Generating the Updated Representation of the Entering Column

Define the following notation:

$a_{\cdot j}$ = the column vector consisting of $a_{0j}, a_{1j}, \ldots, a_{Mj}$

$\alpha_{\cdot j} = B^{-1} a_{\cdot j}$

$a_{ij}$ = element i of $a_{\cdot j}$, i = 0, 1, ..., M.

The notation (i;k) when used as a subscript denotes elements i through k along the associated dimension, e.g., $a_{\cdot j} = a_{(0;M)j}$.

If j is the entering column, then we want to get $B^{-1} a_{\cdot j} = \alpha_{\cdot j}$. From the previously encountered form of $B^{-1}$, this gives:

$$\alpha_{(0;m)j} = \left[\, (C - DT^{-1}E)^{-1} \;:\; (C - DT^{-1}E)^{-1}(-D)T^{-1} \,\right] a_{\cdot j} \qquad (16)$$

$$\alpha_{(m+1;M)j} = \left[\, T^{-1}(-E)(C - DT^{-1}E)^{-1} \;:\; T^{-1}(-E)(C - DT^{-1}E)^{-1}(-D)T^{-1} + T^{-1} \,\right] a_{\cdot j} \qquad (17)$$

Consider (16): simplifying further,

$$\alpha_{(0;m)j} = (C - DT^{-1}E)^{-1} \left( a_{(0;m)j} + (-DT^{-1})\, a_{(m+1;M)j} \right)$$

$(DT^{-1})\, a_{(m+1;M)j}$ is computed easily if we use the special properties of T stated in Section 3. Let

$$f = T^{-1} a_{(m+1;M)j}.$$

Then

$$Tf = a_{(m+1;M)j}$$

and we solve this system as in Section 3. Usually $a_{(m+1;M)j}$ contains only a few nonzero elements, and we start with the top-most nonzero element in $a_{(m+1;M)j}$. The row corresponding to it gives the corresponding $f_i$. The next step is to compute Df. In an implementation currently under way, we plan to store D by columns. f is quite sparse and we need access only to those columns of D corresponding to the few nonzeros of f. Because $T^{-1} a_{(m+1;M)j}$ is a column vector, $DT^{-1} a_{(m+1;M)j}$ is easily computed. Recalling that $(C - DT^{-1}E)^{-1}$ is the smaller inverse we are carrying, $\alpha_{(0;m)j}$ is now available.

Consider (17): simplifying further,

$$\alpha_{(m+1;M)j} = T^{-1}(-E)\left[ (C - DT^{-1}E)^{-1} a_{(0;m)j} + (C - DT^{-1}E)^{-1}(-D)T^{-1} a_{(m+1;M)j} \right] + T^{-1} a_{(m+1;M)j}$$

$$= T^{-1}(-E)\, \alpha_{(0;m)j} + T^{-1} a_{(m+1;M)j}$$

$$= f - T^{-1} E\, \alpha_{(0;m)j}$$

We already have f and $\alpha_{(0;m)j}$. Now we compute $E \cdot \alpha_{(0;m)j}$ and pre-multiply by $T^{-1}$. Pre-multiplication by $T^{-1}$ follows the simpler method already discussed. $\alpha_{(m+1;M)j}$, and hence the entire $\alpha_{\cdot j}$, is now available.

Selection of the Pivot Row

This is done very much in the usual fashion by computing ratios. We already have the updated version of the entering column, and we need the updated RHS:

$$\alpha_{\cdot 0} = B^{-1} a_{\cdot 0}$$

$\alpha_{(0;m)0}$ is given by:

$$\alpha_{(0;m)0} = (C - DT^{-1}E)^{-1} \left( a_{(0;m),0} - DT^{-1} a_{(m+1;M)0} \right)$$

We could follow the same steps as in the previous section and compute $\alpha_{(0;m)0}$, but a better way is to carry $\alpha_{(0;m),0}$ explicitly and update it at each pivot by considering it as another column of $(C - DT^{-1}E)^{-1}$.

$\alpha_{(m+1;M)0}$ is computed by back substituting the values of variables basic in implicit rows that have nonzero coefficients in a given explicit row. We need only the elements of $\alpha_{(m+1;M)0}$ that correspond to $\alpha_{(m+1;M)j} \neq 0$. The actual procedure to get the required $\alpha_{i0}$'s by back substitution is similar to the looping discussed in constructing the pricing vector, where the staircase structure of T is exploited.
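
The entering-column update of equations (16) and (17) can be sketched with dense lists as follows. It uses only the smaller inverse, written R here, together with D, E and triangular solves with T. All names and the small numerical example are assumptions made for this illustration, and no sparsity is exploited.

```python
def solve_lower(T, r):
    """Solve T f = r for a nonsingular lower triangular T (list of rows)."""
    f = [0.0] * len(T)
    for i in range(len(T)):
        s = r[i] - sum(T[i][k] * f[k] for k in range(i))
        f[i] = s / T[i][i]
    return f


def mat_vec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def update_entering_column(R, D, E, T, a_up, a_low):
    """Return (alpha_up, alpha_low) = B^{-1} a for B = [[C, D], [E, T]].

    R is the smaller inverse (C - D T^{-1} E)^{-1}; a_up holds the
    explicit rows 0..m of the column and a_low the implicit rows.
    """
    f = solve_lower(T, a_low)                     # f = T^{-1} a_low
    Df = mat_vec(D, f)
    alpha_up = mat_vec(R, [u - v for u, v in zip(a_up, Df)])
    correction = solve_lower(T, mat_vec(E, alpha_up))
    alpha_low = [x - y for x, y in zip(f, correction)]
    return alpha_up, alpha_low


# Hypothetical blocks: one explicit row, two implicit rows.
C = [[2.0]]
D = [[1.0, 0.0]]
E = [[1.0], [0.0]]
T = [[1.0, 0.0], [-1.0, 1.0]]
R = [[1.0]]  # (C - D T^{-1} E)^{-1} for these blocks
alpha_up, alpha_low = update_entering_column(R, D, E, T, [3.0], [2.0, 1.0])
```

Multiplying the full basis by the result should reproduce the original column, which gives a simple consistency check on the block formulas.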

Pivoting

The type of pivot to be performed depends on whether the departing column is basic in an explicit row or an implicit row. Let us consider the several cases that may arise.

Let

c = entering column

d = departing column

r = row in which column d is basic.

Case (i). Departing column basic in an explicit row.

Column c replaces d in (C, E)'. The change affects only C and E, and the effect is to change only one column in $(C - DT^{-1}E)$. The new column in $(C - DT^{-1}E)$ is

$$a_{(0;m)c} - DT^{-1} a_{(m+1;M)c}.$$

In computing the updated version of the entering column we had

$$\alpha_{(0;m)c} = (C - DT^{-1}E)^{-1} \left( a_{(0;m)c} - DT^{-1} a_{(m+1;M)c} \right).$$

Thus the new $(C - DT^{-1}E)^{-1}$ is obtained by using $\alpha_{(0;m)c}$ as pivot column and pivoting on element $\alpha_{rc}$. Thus, this pivot is identical to a conventional revised simplex pivot.

Case (ii) a. Departing column basic in an implicit row, and c can directly replace d in row r, i.e., $r > m$ and $a_{ic} = 0$ for $m < i < r$, $a_{rc} = 1$.

Now a column is changed in (D, T)'. The difficult part in this case, as well as in the next case, (ii) b, is identifying which columns of $(C - DT^{-1}E)$ have been affected by this column change.

Let o/n as a subscript denote the old/new matrix; e.g., $T_o^{-1}$ is $T^{-1}$ before the change of a column in (D, T)'.

Let

$$R = (C - DT^{-1}E)^{-1}.$$

We have to repivot in any row i (i = 1, ..., m) for which

$$R \left( a_{(0;m)b(i)} - D_n T_n^{-1} a_{(m+1;M)b(i)} \right) \neq e_i$$

where $e_i$ is a 0 vector except for a 1 in row i.

The change of a column in (D, T)' does not affect $a_{(0;m)b(i)}$. It affects $DT^{-1} a_{(m+1;M)b(i)}$ if column b(i) has a nonzero in E in any row that communicates with row r of T. In other words, there will be no effect if $a_{(q;r)b(i)} = 0$, where q is the largest numbered row in T above r which has only a lone nonzero element (the +1 in the diagonal) and such that all $a_{rk} = 0$ for $k < q$. The reason for this lies in the fact that under these circumstances, the solution to

$$Tf = a_{(m+1;M)b(i)}$$

remains the same even after the change in T.

Once we have identified a column b(i) for which the elements of $a_{(q;r)b(i)}$ are not all zero, we compute

$$a_{(0;m)b(i)} - D_n T_n^{-1} a_{(m+1;M)b(i)}$$

using the new D and T. Premultiplication of this by the current R gives us the pivot column. The pivot is made in row i.

As many pivots as there are columns changed in $(C - DT^{-1}E)$ would be needed to update its inverse.

Case (ii) b. Departing column basic in an implicit row, but the entering column cannot replace d in row r, i.e. the entering column is not of the special form required in case (ii) a.

This means that the entering column (call it t) cannot replace d. If it did, it would destroy the partial triangular structure. As in GVUB, we handle this in two steps.

Step 1. Find a column c currently basic in an explicit row that can replace d. Such a column exists, because the new basis (with t in place of d) is a feasible basis, and it too should be adjustable to form (8). Once c is found, step 1 is like case (ii) a.

Step 2. Now replace c by t. This is like case (i), and requires a pivot on $\alpha_{pt}$ using $\alpha_{(0;m)t}$ as the pivot column.

Though it is more complicated than the VUB and GVUB implicit representation, the implicit representation of triangularity constraints also saves much computation in the early steps of the revised simplex method. The pivoting step, particularly case (ii) a, may involve several pivot operations, but this may well be compensated by the savings due to having these operations performed in the smaller inverse instead of the full inverse. Storage-wise, implicit representation is efficient in that it enables us to store problems with a large number of triangularity constraints entirely in main memory.

ABSTRACT


The  Z-transform  technique  is  applied  to  the 
optimization  problem  consisting  of  a  quadratic 
objective  function  and  linear  difference  equations. 
A  rigorous  proof  of  the  general  solution  is  offered, 
and  a  computational  method  is  described  which  is 
likely  to  be  more  efficient  than  matrix  inversion 
or  matrix  iteration  approaches. 


INTRODUCTION 

In a recent paper, Hay and Holt /4/ have presented a general solution for the optimization problem in which the objective function is quadratic, and the variables are subject to linear difference equations, utilizing a technique known as the Z-transform. This technique is an alternative to other solution methods currently in use, such as matrix inversion (e.g. Theil /7/) and matrix iteration (e.g. Bellman /1, p. 78 et seq./, Chow /2/). While the developments in /4/ have contributed some additional insight into the class of quadratic-linear optimization problems, the Z-transform technique as presented seemed to have little computational significance, for as the authors stated, "... matrix iteration or matrix inversion are likely to be computationally superior approaches..." /op. cit. p. 257/.

In  this  paper,  I  return  to  the  Z-transform 
technique  for  solving  quadratic-linear  optimization 
problems.     I  pay  particular  attention  to  the  question 
of  computational  efficiency,  and  reach  the  conclusion 
that  the  Z-transform  technique  can  in  fact  lead  to 
solution  methods  computationally  superior  to  either 
matrix  iteration  or  matrix  inversion,  at  least  under 
certain  stationarity  assumptions,  contrary  to  what 
is  suggested  in  /4/.     In  arriving  at  this  conclusion, 
I  suggest  an  algorithm  that  is  different  from  that 
in  /4/,  in  the  way  certain  characteristic  roots  are 
determined.     These  developments  were  made  as  part  of 
an  investigation  /8/  of  an  educational  planning 
problem,  where  because  of  the  large  number  of  system 
variables,  and  the  long  lead  times  involved  (which 
result  in  difference  equations  of  high  order) , 
computational  efficiency  is  placed  at  a  premium. 

Altogether    apart  from  the  issue  of 
computational  efficiency,  the  proof  of  the  general 
solution  is  different  in  certain  respects  from  that 


found  in  /4/.     Specifically,  the  proof  is  both 
simpler and more rigorous.  Not only that, but the
particular notational devices used in constructing
the proof lead naturally to the more efficient
algorithm developed, in contrast to /4/, where the
notational devices are such that it is not so easy
to see the (computationally) important equivalence
between  the  characteristic  root  problem,  and  the 
eigenvalue  problem  for  a  suitably  constructed 
matrix.     Thus  this  paper  sets  out  to  make  a 
contribution  at  two  levels:     firstly,   in  respect  of 
computational  efficiency,  and  secondly  in  respect 
of  a  further  consolidation  of  the  theory  underlying 
the  use  of  Z-transform  techniques  for  the  quadratic- 
linear  optimization  problem. 

This  paper  is  structured  into  two  parts.  Part 
1  describes  the  basic  approach  of  the  Z-transform 
method  of  solution,  following  /4/  but  utilizing 
different  notational  devices.     Part  2  is  where  the 
real  contribution  of  this  paper  lies,  addressing 
the  two  critical  issues  arising  from  the  Z-transform 
approach,  viz.    (i)  the  question  of  the  existence  of 
certain  roots  needed  in  the  solution  procedure;  and 
(ii)  the  question  of  an  efficient  computational 
routine  for  locating  those  roots.     The  relative 
efficiencies  of  alternative  computational  approaches 
are  then  compared,  and  the  paper  concluded. 


PART   1  -  THE  Z-TRANSFORM  APPROACH 

In  this  part  of  the  paper,  the  Z-transform 
approach  to  quadratic-linear  optimization  problems 
is  described.     Following  a  general  statement  of  the 
problem,  the   (first-order)  conditions  for  a  maximum 
are  derived.     Some  restricting  (stationarity) 
assumptions  are  then  introduced,  and  the  Z-transform 
technique  used  to  show  how  a  solution  could  be 
derived  from  the  system  of  linear  equations  defining 
the  maximum  conditions. 

THE  GENERAL  PROBLEM 

The problem under consideration, following
Theil's /7/ canonical form, is the following:-

Find y which maximizes

$$\phi(x,y) = a'x + b'y + \tfrac{1}{2}x'Ax + \tfrac{1}{2}y'By + \tfrac{1}{2}x'Cy + \tfrac{1}{2}y'C'x \tag{1a}$$

subject to

$$x = Ry + s. \tag{1b}$$
That  is,  we  have  a  quadratic  criterion  in  two  sets 
of  variables  x  and  y.     The  set  of  variables  y  are 
the  decision  variables,  which  (it  is  assumed)  can 
be  manipulated  by  the  decision-maker.     The  set  of 
variables  x  are  called  state  variables.  These 
are  affected  by  the  decision  variables  in  accord- 
ance with  the  linear  side  relations  given  in 
equation   (lb) .     The  state  variables  are  also 
affected  by  the  set  of  variables  s,  which  are 
uncontrollable  from  the  decision-maker's  viewpoint, 
and  exogenously  specified.     The  parameters  of  the 
problem  are  the  vectors  a  and  b,  and  the  matrices 
A,  B,  C,  and  R. 

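The sets of variables x, y and s are
time-partitioned as follows:-

$$x = [x_1', x_2', \ldots, x_T']' \;\; (ST \times 1), \qquad
y = [y_0', y_1', \ldots, y_{T-1}']' \;\; (DT \times 1), \qquad
s = [s_1', s_2', \ldots, s_T']' \;\; (ST \times 1). \tag{2}$$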


That  is ,  for  each  of  T  periods  in  the  planning 
horizon,  there  are  S  state  variables,  S  exogenous 
variables,  and  D  decision  variables,  so  that  x  is 
of  dimension  STxl,  y  is  of  dimension  DTxl,  and  s 
is  of  dimension  STxl. 

The vector and matrix parameters a, b, A, B, C, R
are of dimensions which conform to x and y.  That
is, a is ST×1, b is DT×1, A is ST×ST, B is DT×DT,
C is ST×DT and R is ST×DT.

The linear decision rule consists in finding
the optimal first-period decisions, that is $y_0^*$,
when the planning horizon T → ∞, as a function of
the exogenous variables, the asterisk in $y_0^*$ being
used to denote optimality.  Note here that when
x, y and s are as defined in (2), the exogenous
variable $s_1$ can be expressed as a function of the
most recently observed state $x_0$.  For this reason,
writing a first-period linear decision rule as a
function of the exogenous variables $s_1, s_2, \ldots$
is equivalent to a feedback control law (Chow /2/)
in which the optimal decision in the t-th period,
$y_t^*$, is written as a function of the most recently
observed state $x_{t-1}$.  In this paper, therefore,
and contrary to Chow's /2, p. 17/ viewpoint, we
take the practical view that we are dealing
essentially with differences in computational
method when we compare matrix iteration, matrix
inversion and Z-transform methods for solving
the problem stated.

MAXIMUM  CONDITIONS 

The  formal  first  order  conditions  for  a 
maximum,  are  obtained  by  substituting  for  x  in  (la) 
by  making  use  of   (lb) ,  and  then  differentiating  and 
setting  the  result  equal  to  zero   (Theil  /7/) ,  as 
follows:     Substituting   (lb)   into   (la),  we  get:- 

$$\phi(y) = a'(Ry + s) + b'y + \tfrac{1}{2}(Ry + s)'A(Ry + s) + \tfrac{1}{2}y'By + \tfrac{1}{2}(Ry + s)'Cy + \tfrac{1}{2}y'C'(Ry + s) \tag{3}$$

i.e.

$$\phi(y) = k_0 + k'y + y'Ky \tag{4}$$

where,

$$k_0 = a's + \tfrac{1}{2}s'As \tag{5a}$$

$$k' = a'R + b' + s'AR + s'C \tag{5b}$$

$$K = \tfrac{1}{2}\{R'AR + B + R'C + C'R\} \tag{5c}$$

Differentiating (4) with respect to y and setting
the result equal to zero, we get the formal first
order conditions for a maximum, thus:-

$$Ky^* = -k \tag{6}$$


We  are  concerned  in  this  paper  only  with 
methods  for  solving  the  system  of  equations  given 
by  the  first-order  conditions.     The  second-order 
conditions  for  a  maximum  are  not  discussed  here, 
although it may be remarked in passing that
sufficient, but not necessary, second-order
conditions are that K be negative-definite.  See
Theil /7/ for a discussion of this issue.
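The reduction from (3) to (4) can be checked numerically. The following is a minimal Python sketch, assuming NumPy and small randomly generated parameters (the sizes `n_x`, `n_y` and all numerical values are illustrative assumptions, not data from the paper). It evaluates the criterion (1a) at x = Ry + s and compares it with the reduced form (4) built from (5a) to (5c).

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_y = 4, 3  # stand-ins for ST and DT (illustrative sizes only)

a, s = rng.normal(size=n_x), rng.normal(size=n_x)
b = rng.normal(size=n_y)
A = rng.normal(size=(n_x, n_x)); A = (A + A.T) / 2   # symmetric, as assumed for A
B = rng.normal(size=(n_y, n_y)); B = (B + B.T) / 2   # symmetric, as assumed for B
C = rng.normal(size=(n_x, n_y))
R = rng.normal(size=(n_x, n_y))

def phi(x, y):
    """Quadratic criterion (1a)."""
    return (a @ x + b @ y + 0.5 * x @ A @ x + 0.5 * y @ B @ y
            + 0.5 * x @ C @ y + 0.5 * y @ C.T @ x)

# Reduced-form parameters (5a)-(5c); k is stored as a column, i.e. the transpose of k'.
k0 = a @ s + 0.5 * s @ A @ s
k = R.T @ a + b + R.T @ A @ s + C.T @ s
K = 0.5 * (R.T @ A @ R + B + R.T @ C + C.T @ R)

y = rng.normal(size=n_y)
lhs = phi(R @ y + s, y)          # criterion with the side relation (1b) substituted
rhs = k0 + k @ y + y @ K @ y     # reduced form (4)
print(abs(lhs - rhs))
```

The printed difference should be at rounding-error level, and K comes out symmetric by construction.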

ASSUMPTIONS 

We  now  introduce  the  following  assumptions : - 

(i)     Assume that A, B and C are real, band
block-diagonal matrices, with (time-)invariant
band, and that A and B are also symmetric.

(ii)     Assume that R is lower band block-
diagonal and real, with (time-)invariant
band.

(iii)     Assume  that  all  variables  and  parameters 
are  real,  if  not  already  so  specified. 

(iv)     Assume  that  the  parameters  A,  B,  C  and 
R  have  values  such  that  the  second- 
order  conditions  are  satisfied. 

Underlying assumptions (i) and (ii) is the
more  basic  assumption  that  cross-temporal  effects, 
both  in  the  criterion  function  and  in  the  system 
dynamics   (the  linear  side-relations),  are  limited 
in  the  number  of  periods  they  span.  Furthermore, 
the  assumption  is  made  that  these  cross-temporal 
effects  are  time-invariant,  so  that  for  any  given 
cross-temporal  span,  the  relevant  parameters  are 
the  same,  regardless  of  where  in  the  planning 
horizon  the  span  is  actually  taken.  Considering 
the fact that we are letting T → ∞, this would seem
to be a necessary assumption in most cases.

The  symmetry  assumption  for  A  and  B  can  be 
made  without  loss  of  generality  since  we  are  dealing 
with quadratic forms.  The assumption that R is
lower band block-diagonal amounts to saying that
current  values  of  the  state  variables  are  dependent 
only  on  previous  decisions,  and  not  at  all  on  future 
decisions,  which  would  seem  to  be  a  necessary 
assumption  in  most  problem  situations. 

The  assumption  that  all  parameters  are  real 
needs  no  justification.     It  would  be  difficult  to 
give  a  practical  interpretation  of  complex 
parameters  in  most  applications. 
Following on assumptions (i) and (ii), we
have the result that K must necessarily be band
block-diagonal, symmetric and real.  This result
can be quite easily proved, though with some tedium,
using the defining relation (5c).

Taking the relevant cross-temporal 'span' of
the system to be 'd' periods, therefore (this may
be a system lead time; for example, in an
educational system, the lead time between initial
enrolment and final graduation may be considered to
be (at most) six years, or periods, in which case
d = 6), and with the symbol (') denoting
transposition, we may write K as follows:-

$$K = \begin{bmatrix}
K_d & K_{d-1} & \cdots & K_0 & & & \\
K_{d-1}' & K_d & K_{d-1} & \cdots & K_0 & & \\
\vdots & \ddots & \ddots & \ddots & & \ddots & \\
K_0' & \cdots & K_{d-1}' & K_d & K_{d-1} & \cdots & K_0 \\
 & \ddots & & \ddots & \ddots & \ddots & \ddots
\end{bmatrix}_{DT \times DT} \tag{7}$$

This is merely a general characterisation of a
band block-diagonal, real, symmetric matrix.  When
T → ∞ the infinite-band portion has "width" 2d+1,
made up of the 2d+1 block elements
$K_0', \ldots, K_{d-1}', K_d, K_{d-1}, \ldots, K_0$.
Note that in this characterisation we must have
that $K_d$ is symmetric, which it is, from (5c).
Note also that each block element has dimension
D×D, and can be obtained from (5c) by first
time-partitioning A, B, C, and R into block
elements conforming to Assumptions (i) and (ii),
and then carrying out the indicated operations.
The result (7) which then follows will form the
basis for the method of solution to which we are
headed.
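The band structure described for (7) can be illustrated with a minimal Python sketch. It assumes NumPy, and it assumes that a finite T-period truncation simply repeats the same blocks right up to the corners of the matrix. The function name `build_band_matrix` and the numerical blocks are hypothetical and chosen only for illustration.

```python
import numpy as np

def build_band_matrix(blocks, T):
    """Assemble a DT x DT band block matrix from blocks = [K_0, ..., K_d].

    Block (r, c) is K_{d-j} for c = r + j and K_{d-j}' for c = r - j,
    0 <= j <= d, so K_d sits on the diagonal and K_0 on the outermost band.
    """
    d = len(blocks) - 1
    D = blocks[0].shape[0]
    K = np.zeros((D * T, D * T))
    for r in range(T):
        for c in range(max(0, r - d), min(T, r + d + 1)):
            j = abs(c - r)
            blk = blocks[d - j] if c >= r else blocks[d - j].T
            K[r * D:(r + 1) * D, c * D:(c + 1) * D] = blk
    return K

# Illustrative blocks with d = 2 and D = 2; K_d (here K2) must be symmetric.
K0 = np.array([[1.0, 0.2], [0.1, 0.8]])
K1 = np.array([[0.5, -0.3], [0.4, 0.2]])
K2 = np.array([[-10.0, 1.0], [1.0, -8.0]])
K = build_band_matrix([K0, K1, K2], T=6)
print(K.shape)
```

With these inputs the result is symmetric, its outermost band block in the first block row is K_0, and everything beyond the band is zero.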

METHOD  OF  SOLUTION 


Substituting (7) into (6), letting T → ∞,
and taking the Z-transform¹ of the result, we get:

$$\sum_{i=0}^{2d-1} Q_i(z)\,y_i^* + z\,F(z)\,Z(y_t^*) = Z(k_t) \tag{8}$$

where Z(.) denotes the Z-transform of a (vector,
in this case) sequence of variables; $Q_i(z)$,
i = 0, ..., 2d-1, are the polynomial matrices that
premultiply $y_0^*, y_1^*, \ldots, y_{2d-1}^*$, which are
obtained upon taking the Z-transform, and F(z) is
the characteristic polynomial (matrix) associated
with the infinite-band portion of (7).  F(z) is
defined by the following:-

$$F(z) = \sum_{i=0}^{d-1} z^i K_i' + \sum_{i=0}^{d} z^{d+i} K_{d-i} \tag{9}$$
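As an illustration of definition (9), the following minimal Python sketch (assuming NumPy; the function name `F` mirrors the paper's symbol, while the numerical blocks are hypothetical) evaluates F(z) at a complex point. It also checks the property on which Theorem 2 in Part 2 rests, namely that the ji-th element of F(z) is the reverse of the ij-th element, which is equivalent to F(z) = z^{2d} F(1/z)'.

```python
import numpy as np

def F(blocks, z):
    """Characteristic polynomial matrix (9) for blocks = [K_0, ..., K_d]."""
    d = len(blocks) - 1
    low = sum(z**i * blocks[i].T for i in range(d))             # z^i K_i', i = 0..d-1
    high = sum(z**(d + i) * blocks[d - i] for i in range(d + 1))  # z^(d+i) K_(d-i), i = 0..d
    return low + high

# Illustrative blocks (d = 2, D = 2); K2 plays the role of the symmetric K_d.
K0 = np.array([[1.0, 0.2], [0.1, 0.8]])
K1 = np.array([[0.5, -0.3], [0.4, 0.2]])
K2 = np.array([[-10.0, 1.0], [1.0, -8.0]])
blocks = [K0, K1, K2]
d = len(blocks) - 1

z = 0.3 + 0.4j
lhs = F(blocks, z)
rhs = z**(2 * d) * F(blocks, 1 / z).T   # plain transpose, not conjugate transpose
print(np.max(np.abs(lhs - rhs)))
```

The printed discrepancy should be at rounding-error level for any nonzero z.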


Note that for any arbitrary value of the
variable z, (8) is an equation in the (2d+1)D
variables $y_0^*, y_1^*, \ldots, y_{2d-1}^*, Z(y_t^*)$.

Now multiplying (8) through by $\tilde{F}(z)$, the
adjoint of F(z), and making use of the fact that
for any general square matrix A, $\tilde{A}A = |A|I$, where
$\tilde{A}$ denotes the adjoint of A, |A| denotes the
determinant of A, and I the identity matrix, we get:-

$$\tilde{F}\Big\{\sum_{i=0}^{2d-1} Q_i\,y_i^*\Big\} + z\,|F|\,Z(y_t^*) = \tilde{F}\,Z(k_t) \tag{10}$$

where the z arguments denoting the dependence of
the polynomial matrices $F, \tilde{F}, Q_0, Q_1, \ldots, Q_{2d-1}$
on z have been dropped for notational simplicity.
Now, if we choose z within the unit circle such
that |F(z)| = 0, the term in $Z(y_t^*)$ drops out;
and if further, the sequence $\{k_t,\ t = 0,1,2,\ldots\}$
is bounded above, then $Z(k_t)$ is finite.²  We would
thus have an equation in only the 2dD variables
$y_0^*, \ldots, y_{2d-1}^*$.  Such a z would be by
definition a root of the characteristic polynomial
|F(z)|.  Furthermore, we know from the result in
Linear Algebra /3, p. 61/ that $\tilde{F}(z)$ has rank unity
for any root of |F(z)|.  Consequently, substitution
in (10) of any $z_m$, say, within the unit circle,
which is a root of |F(z)|, yields exactly one
linearly independent equation in the 2dD variables
$y_0^*, \ldots, y_{2d-1}^*$, obtained by taking a single
non-zero row from $\tilde{F}(z_m)$.  Denoting such a row by
$\tilde{f}_m$ (1×D), we have on substitution of a root $z_m$
into (10), one linearly independent equation defined
thus:-

$$\tilde{f}_m\Big\{\sum_{i=0}^{2d-1} Q_i(z_m)\,y_i^*\Big\} = \tilde{f}_m\,Z(k_t) \tag{11}$$


In order to solve for the 2dD variables
$y_0^*, y_1^*, \ldots, y_{2d-1}^*$, we need exactly 2dD
linearly independent equations.  We may obtain dD
linearly independent equations from the original
conditions for a local maximum.  Observe there that
the first dD equations contain only the 2dD
variables $y_0^*, y_1^*, \ldots, y_{2d-1}^*$.  The remaining
dD linearly independent equations may be obtained
from equation (11), by substituting dD roots $z_m$
satisfying $|z_m| < 1$, m = 1,2, ..., dD.  It is
obvious that this procedure can only work if there
are exactly dD roots of |F(z)| lying within the
unit circle of the complex plane.  Furthermore, the
computational efficiency of this procedure depends
critically upon the efficiency with which the roots
$z_m$, m = 1,2, ..., dD can be found.  The rest of
this paper addresses itself to these two questions.
Note in passing that the occurrence of complex
roots poses no insuperable problems.  See /4, 5, 8/
for methods of handling complex roots.


PART 2 - A PROOF OF EXISTENCE, AND
A METHOD FOR FINDING THE ROOTS

In this part of the paper, firstly we prove
the existence of exactly dD roots of |F(z)| lying
within the unit circle of the complex plane, using
a different and more rigorous approach than Hay
and Holt /4/.  Secondly, we describe an efficient
method for obtaining the dD roots of |F(z)| lying
within the unit circle.  Thirdly, a brief
comparison is made of the relative efficiencies
of the alternative approaches.

EXISTENCE PROOF

To prove the existence of dD roots of |F(z)|
lying within the unit circle of the complex plane,
the approach is as follows:  The concept of a
symmetrical polynomial (Definition 1) is introduced,
and it is proved that symmetrical polynomials have
an even number of roots, half of which lie within
the unit circle of the complex plane (Theorem 1).
It is then shown that |F(z)| is a symmetrical
polynomial of degree 2dD (Theorem 2), from which
the required result immediately follows (Theorem 3).
To prove that |F(z)| is symmetrical, it is necessary
to introduce the concept of the reverse of a
polynomial (Definition 2) and to prove first
several lemmata involving this, and the concept of
symmetrical polynomials.

DEFINITION 1:  A symmetrical³ polynomial is a
polynomial of even degree 2n, say,
$f(x) = \sum_{r=0}^{2n} a_r x^r$, such that
$a_r = a_{2n-r}$, r = 0, ..., n.  Example:-

$$3 + 2x + 4x^2 + 2x^3 + 3x^4$$

is symmetrical.

DEFINITION 2:  The reverse of an n-degree
polynomial $\sum_{r=0}^{n} a_r x^r$ is defined to be the
n-degree polynomial $\sum_{r=0}^{n} a_{n-r} x^r$.  Let R{.}
denote the reverse of a polynomial.  Example:-

$$R\{3 + 2x + 4x^2\} = 4 + 2x + 3x^2$$

The reverse of an n-degree polynomial f(x) may
be defined alternatively as follows:-

$$R\{f(x)\} \equiv x^n f(1/x)$$

The equivalence of the two alternative
definitions may be easily proved.  Example:-

$$R\{3 + 2x + 4x^2\} = x^2\Big(3 + \frac{2}{x} + \frac{4}{x^2}\Big) = 4 + 2x + 3x^2$$

as before.

LEMMA 1:  If x is a root of the symmetrical
polynomial f(x), then its reciprocal 1/x is also a
root.  That is, if f(x) is symmetrical,
f(x) = 0 <=> f(1/x) = 0.

Proof:  The proof follows easily from the
definition of a symmetrical polynomial, and is
therefore omitted.

THEOREM 1:  A symmetrical polynomial of degree
2n possesses exactly n roots, $x_i$, i = 1, ..., n,
some possibly repeated, satisfying $|x_i| < 1$,
i = 1, ..., n; provided only that there is no root
x such that |x| = 1.

Proof:  For any (in general complex) number x,
either |x| < 1 or |x| > 1, if |x| ≠ 1.
Furthermore, |x| > 1 <=> |1/x| < 1.
Now, by Lemma 1, f(x) = 0 <=> f(1/x) = 0.
Moreover, being of degree 2n, f(x) must have
exactly 2n roots (some possibly repeated).

By symmetry, therefore, f(x) must have exactly
n roots satisfying |x| < 1 and exactly n satisfying
|x| > 1 (some possibly repeated); provided only that
there is no root x such that |x| = 1.  QED.

LEMMA 2:  The sum of symmetrical polynomials
of the same degree is also symmetrical.

Proof:  The proof is obvious and therefore
omitted.

LEMMA 3:  The sum of an even-degree polynomial
and its reverse is a symmetrical polynomial.  That
is, f(x) + R{f(x)} is symmetrical if f(x) is of
even degree.

Proof:  The proof follows easily from the
definitions of a symmetrical polynomial and the
reverse of a polynomial, and is therefore omitted.

LEMMA 4:  The reverse of a product of
polynomials is the product of reverses of the
individual polynomials making up the product.  That
is, $R\{f_1 \cdot f_2 \cdot \ldots \cdot f_p\} = R\{f_1\} \cdot R\{f_2\} \cdot \ldots \cdot R\{f_p\}$.

Proof:  The proof is by induction.  It is
shown first of all that if the result holds true
for the product of p (p ≥ 2) polynomials, then it
must also hold for the product of p+1 polynomials.

Assume

$$R\{f_1 \cdot f_2 \cdot \ldots \cdot f_p\} = R\{f_1\} \cdot R\{f_2\} \cdot \ldots \cdot R\{f_p\}. \tag{12}$$

Then,

$$R\{f_1 \cdot f_2 \cdot \ldots \cdot f_p \cdot f_{p+1}\} = R\{f_1 \cdot f_2\} \cdot R\{f_3\} \cdot \ldots \cdot R\{f_{p+1}\} = R\{f_1\} \cdot R\{f_2\} \cdot R\{f_3\} \cdot \ldots \cdot R\{f_{p+1}\}. \tag{13}$$

That is, the result must then also hold for p+1
polynomials.

Secondly, the result is proved for two
polynomials.  That is, we wish to show
$R\{f_1 \cdot f_2\} = R\{f_1\} \cdot R\{f_2\}$.  Recall first from the
definition, that for an n-degree polynomial f(x),
$R\{f(x)\} = x^n f(1/x)$.  Now suppose that $f_1(x)$ is of
degree n and that $f_2(x)$ is of degree m.  Then,

$$R\{f_1\} \cdot R\{f_2\} = x^{n+m} f_1(1/x)\,f_2(1/x), \tag{14}$$

by definition of reverse.  Now, let

$$f_1(x) \cdot f_2(x) = \phi(x). \tag{15}$$

Thus

$$R\{f_1 \cdot f_2\} = x^{n+m}\,\phi(1/x), \tag{16}$$

by definition of reverse; that is

$$R\{f_1 \cdot f_2\} = x^{n+m} f_1(1/x) \cdot f_2(1/x),$$

by (15).  Hence, by (14)

$$R\{f_1 \cdot f_2\} = R\{f_1\} \cdot R\{f_2\}$$

QED.

LEMMA 5:  An even-degree polynomial which is
its own reverse is a symmetrical polynomial.  That
is, f(x) = R{f(x)} => f(x) is symmetrical.

Proof:  The proof follows easily from the
definition of a symmetrical polynomial, and is
therefore omitted.

THEOREM 2:  The determinant of F(z), where
F(z) is the polynomial matrix as defined in
equation (9), is a symmetrical polynomial of
degree 2dD.

Proof:  F(z) is a polynomial matrix whose
ij-th element is a polynomial of (even) degree 2d
given by equation (9), elaborated here for
convenience.

$$f_{ij}(z) = \sum_{k=0}^{d-1} z^k (K_k')_{ij} + \sum_{k=0}^{d} z^{d+k} (K_{d-k})_{ij}, \quad i,j = 1, \ldots, D \tag{17}$$

where $(\cdot)_{ij}$ denotes the ij-th element of the matrix
in brackets.

Making use of the fact that $K_d = K_d'$, and that
$(K_a')_{ji} = (K_a)_{ij}$, a = 0,1,2, ..., d-1, by the
definition of transposition, we have,

$$f_{ji}(z) = \sum_{k=0}^{d} z^k (K_k)_{ij} + \sum_{k=1}^{d} z^{d+k} (K_{d-k}')_{ij}, \quad i,j = 1, \ldots, D \tag{18}$$

that is, we have shown that $f_{ji}(z)$ is the reverse
of $f_{ij}(z)$; that is $f_{ji}(z) = R\{f_{ij}(z)\}$, i,j = 1, ..., D.

Now by definition of a determinant /6, pp. 225,
226/

$$|F(z)| = \sum_k (\mathrm{sgn})_k \prod_{i=1}^{D} f_{\sigma_k(i),\,i}(z), \tag{19}$$

where $\sigma_k(1), \ldots, \sigma_k(D)$ is the k-th permutation of
the integers 1, ..., D, and where the sum is taken over
all such permutations.  The symbol $(\mathrm{sgn})_k$ denotes
+1 if the integers $\sigma_k(1), \ldots, \sigma_k(D)$ can be
converted into 1, ..., D by an even number of
interchanges; otherwise, $(\mathrm{sgn})_k = -1$.

Note that every term in the summation consists
of a product of D polynomials each of degree 2d.
Consequently |F(z)| is a polynomial of (even) degree
2dD, provided of course that $K_0$ is non-null, which
we may assume without loss of generality.

Furthermore, every term in the summation is
either symmetrical, or has its reverse in the
summation.  To prove this assertion, consider the
k-th term in the summation:

$$(\mathrm{sgn})_k \prod_{i=1}^{D} f_{\sigma_k(i),\,i}(z) \tag{20}$$

Since $f_{ji}(z) = R\{f_{ij}(z)\}$, i,j = 1, ..., D, and in
view of Lemma 4, the reverse of the k-th term can
be obtained quite simply by reversing the subscripts
on the polynomials in the product and keeping the
sign, to get:-

$$(\mathrm{sgn})_k \prod_{i=1}^{D} f_{i,\,\sigma_k(i)}(z) \tag{21}$$

Now let $\sigma_{k'}(1), \sigma_{k'}(2), \ldots, \sigma_{k'}(D)$ be the
permutation of 1,2, ..., D defined by the first
subscripts on rearranging (21) so that the second
subscripts are in the order 1,2, ..., D.  But
$\sigma_{k'}(1), \sigma_{k'}(2), \ldots, \sigma_{k'}(D)$ is clearly another
permutation of 1,2, ..., D, with $(\mathrm{sgn})_k = (\mathrm{sgn})_{k'}$, and so

$$(\mathrm{sgn})_{k'} \prod_{i=1}^{D} f_{\sigma_{k'}(i),\,i}(z)$$

must be a term in the summation.  If it is identical
to the k-th term, then by Lemma 5 we have that the
k-th term is symmetrical.  If it is not identical
to the k-th term, it is by construction the reverse
of the k-th term.  Hence we have the result that
every term in the summation in (19) is either
symmetrical or has its reverse in the summation.

Therefore, by Lemmas 2 and 3, we have that
the summation of 2dD-degree polynomials defining
|F(z)| is symmetrical, and of degree 2dD.  QED.

THEOREM 3:  |F(z)| possesses exactly dD roots,
$z_i$, i = 1, ..., dD, some possibly repeated,
satisfying $|z_i| < 1$, i = 1, ..., dD; provided only that
there exists no root z such that |z| = 1.⁴

Proof:  By Theorem 2, we have that |F(z)| is
a symmetrical polynomial of degree 2dD, which is
even.

By Theorem 1, therefore, |F(z)| must possess
exactly dD roots, $z_i$, i = 1, ..., dD, some
possibly repeated, satisfying $|z_i| < 1$; provided
only that there exists no root z such that |z| = 1.

QED.
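The definitions and results of this section can be tried out numerically. The following minimal Python sketch assumes NumPy, represents a polynomial by its coefficient list [a_0, a_1, ..., a_n], and uses hypothetical helper names (`reverse`, `is_symmetrical`, `poly_mul`). It checks Lemma 4 on one pair of polynomials and counts the roots inside the unit circle for a symmetrical polynomial chosen, as an assumption for the example, to have no root of modulus one.

```python
import numpy as np

def reverse(coeffs):
    """Reverse of a polynomial [a_0, ..., a_n] (Definition 2)."""
    return list(coeffs[::-1])

def is_symmetrical(coeffs):
    """Definition 1: even degree 2n and a_r == a_{2n-r}."""
    return len(coeffs) % 2 == 1 and list(coeffs) == reverse(coeffs)

def poly_mul(f, g):
    """Product of two integer-coefficient polynomials, lowest power first."""
    return [int(c) for c in np.convolve(f, g)]

# Lemma 4 on one example: reverse of a product equals product of reverses.
f1 = [3, 2, 4]        # 3 + 2x + 4x^2
f2 = [1, 0, 5, 2]     # 1 + 5x^2 + 2x^3
lemma4_holds = reverse(poly_mul(f1, f2)) == poly_mul(reverse(f1), reverse(f2))

# Theorem 1 on 6 - 35x + 62x^2 - 35x^3 + 6x^4, whose roots are 2, 1/2, 3, 1/3.
p = [6, -35, 62, -35, 6]
roots = np.roots(p[::-1])                 # np.roots expects the highest power first
inside = int(np.sum(np.abs(roots) < 1))   # number of roots with modulus below one

# The example polynomial of Definition 1 has all four roots on the unit circle.
moduli_def1 = np.abs(np.roots([3, 2, 4, 2, 3]))
print(lemma4_holds, inside, moduli_def1)
```

The polynomial p has degree 2n = 4 and exactly n = 2 roots inside the unit circle, as Theorem 1 states. The example polynomial of Definition 1 is not used for this count because all of its roots have modulus one, so the proviso of Theorem 1 does not hold for it.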

FINDING THE ROOTS

The computational efficiency of the
Z-transform method of solution depends critically
upon the method used to find the roots of |F(z)|.
Hay and Holt /4, p. 242/ report the use of gradient
methods.  The efficiency of this approach is not
discussed in that paper, but the authors seem to
suggest /op.cit., p. 257/ that matrix iteration or
matrix inversion are "... likely to be superior
approaches ...".  The method that is suggested in
this paper^ seems able to make the Z-transform
method computationally superior to either matrix
iteration or matrix inversion, and is developed in
this section.

The method is as follows:  To find the roots
of |F(z)|, first form the 2dD×2dD matrix,

$$\kappa = \begin{bmatrix}
\kappa_1 & \kappa_2 & \cdots & \kappa_{2d-1} & \kappa_{2d} \\
-I & 0 & \cdots & 0 & 0 \\
0 & -I & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & -I & 0
\end{bmatrix} \tag{22}$$

where

$$\kappa_i = \begin{cases} K_0^{-1} K_i, & i = 1, \ldots, d \\ K_0^{-1} K_{2d-i}', & i = d+1, \ldots, 2d \end{cases}$$

and compute its 2dD eigenvalues using one of the
several available computer programmes /9/ to do
this.  The roots of |F(z)| are the eigenvalues of
−κ, that is, the negatives of the eigenvalues of κ,
since the characteristic equation obtained below is
|zI + κ| = 0; the moduli of the roots and of the
eigenvalues of κ therefore coincide.  Note that this
procedure requires that $K_0^{-1}$ exists.  We therefore
wish to prove the following theorem.

THEOREM 4:  The roots of |F(z)| are identical
to the eigenvalues of −κ, provided that $K_0^{-1}$ exists.

Proof:  Substituting (7) into (6) and letting
T → ∞ as before, we have from the infinite-band
portion:-

$$\sum_{i=0}^{d-1} K_i'\,y_{t+i} + \sum_{i=0}^{d} K_{d-i}\,y_{t+d+i} = k_{t+d}, \quad t = 0,1,2,\ldots \tag{23}$$

where the time-partition $k' = [k_0', k_1', k_2', \ldots]$ has
been used.  Note that (23) defines a set of
difference equations of order 2d.  F(z) is the
characteristic polynomial matrix obtained from
taking the Z-transform of this set of difference
equations.  To prove the theorem, (23) will be
written as an equivalent first-order difference
equation, whose characteristic polynomial is
|zI + κ|.  The roots of the characteristic equation
obtained in either case must be identical, and in
the latter case the roots of the characteristic
equation are the eigenvalues of the matrix −κ, by
definition.^

Now, premultiplying (23) by $K_0^{-1}$, we get:-

$$\sum_{i=0}^{2d-1} \kappa_{2d-i}\,y_{t+i} + y_{t+2d} = K_0^{-1} k_{t+d}, \quad t = 0,1,2,\ldots \tag{24}$$

Now, let

$$y_t^{(i)} = y_{t+2d-i}, \quad i = 1, \ldots, 2d; \quad t = 0,1,2,\ldots \tag{25}$$

and define

$$\eta_t = \begin{bmatrix} y_t^{(1)} \\ \vdots \\ y_t^{(2d)} \end{bmatrix}, \quad t = 0,1,2,\ldots \tag{26}$$

The 2d-order difference equation in (24) can
therefore be equivalently written as the following
first-order difference equation:-

$$\kappa\,\eta_t + \eta_{t+1} = \varphi_t, \quad t = 0,1,2,\ldots \tag{27}$$

where we have let

$$\varphi_t = \begin{bmatrix} K_0^{-1} k_{t+d} \\ 0 \\ \vdots \\ 0 \end{bmatrix}_{2dD \times 1}, \quad t = 0,1,2,\ldots \tag{28}$$

The characteristic equation of this system is

$$|zI + \kappa| = 0 \tag{29}$$

the solutions of which are the eigenvalues of −κ,
by definition.  But, from (23) the characteristic
equation of the same system is given by

$$|F(z)| = 0$$

Hence, by a fundamental property of linear dynamic
systems, we have that the eigenvalues of −κ must be
identical to the roots of |F(z)|.  QED.
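The root-finding step can be sketched in Python as follows. This is a minimal illustration assuming NumPy, with hypothetical function names (`F`, `companion`) and hypothetical numerical blocks chosen so that the symmetric block K_d dominates and no root lies on the unit circle. The sign convention follows (29): since the characteristic equation is |zI + κ| = 0, the roots are taken as the eigenvalues of −κ.

```python
import numpy as np

def F(blocks, z):
    """Characteristic polynomial matrix (9) for blocks = [K_0, ..., K_d]."""
    d = len(blocks) - 1
    low = sum(z**i * blocks[i].T for i in range(d))
    high = sum(z**(d + i) * blocks[d - i] for i in range(d + 1))
    return low + high

def companion(blocks):
    """The 2dD x 2dD matrix kappa: first block row kappa_1..kappa_2d, -I below."""
    d = len(blocks) - 1
    D = blocks[0].shape[0]
    K0_inv = np.linalg.inv(blocks[0])       # requires K_0 to be invertible
    top = [K0_inv @ (blocks[i] if i <= d else blocks[2 * d - i].T)
           for i in range(1, 2 * d + 1)]
    n = 2 * d * D
    kappa = np.zeros((n, n))
    kappa[:D, :] = np.hstack(top)
    kappa[D:, :n - D] = -np.eye(n - D)      # -I blocks on the block sub-diagonal
    return kappa

# Illustrative blocks with d = 2, D = 2 (assumed values, not from the paper).
K0 = np.array([[1.0, 0.2], [0.1, 0.8]])
K1 = np.array([[0.5, -0.3], [0.4, 0.2]])
K2 = np.array([[-10.0, 1.0], [1.0, -8.0]])   # symmetric, dominant diagonal block
blocks = [K0, K1, K2]

roots = np.linalg.eigvals(-companion(blocks))
# Smallest singular value of F(z) at each root: near zero means |F(z)| = 0 there.
residuals = [np.linalg.svd(F(blocks, z), compute_uv=False)[-1] for z in roots]
n_inside = int(np.sum(np.abs(roots) < 1))
print(n_inside, max(residuals))
```

For these blocks there are 2dD = 8 roots, of which dD = 4 lie inside the unit circle, in line with Theorem 3. A general-purpose eigenvalue routine is used here only as a stand-in for the programmes cited in /9/.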

COMPARISON  OF  APPROACHES 

In the Z-transform method suggested, the
number of arithmetic operations required is
dominated by two steps, viz., locating the
eigenvalues of a 2dD×2dD matrix, and inverting a
2dD×2dD matrix.  Each of these steps requires in the
order of (2dD)³ arithmetic operations to be carried
out, say k(2dD)³ altogether, where k is an
appropriate factor.

Matrix iteration methods (e.g. Chow /2/) assume
a finite horizon T.  We can write T = nd, where n
is a number chosen so that T approximates infinity,
insofar as the convergence of the first-period
solution is affected.  (In practice this would seem
to require n ≥ 4, if the results in Theil /7, p. 167/
are any indication.)  The matrix iteration method
requires a number of arithmetic operations
dominated at each iteration by the multiplication
of two (d+1)D × (d+1)D matrices.  (This assumes
that in the case of lagged variables, the state
vector is augmented by these lagged variables so
as to retain the first-order system dynamics
assumed by Chow /2/, and that the state vector is
of order not less than the decision vector.)  This
multiplication requires in the order of 2(d+1)³D³
arithmetic operations, and needs to be carried out
nd times altogether.  Matrix iteration therefore
requires in the order of 2nd(d+1)³D³ operations.
This number of operations would therefore exceed
that for the method suggested roughly whenever
2nd(d+1)³ > 8kd³.  Assuming n = 4, this is always
true if k < 6, which will almost certainly be the
case.

Matrix inversion (e.g. Theil /7/) would
proceed by attempting to invert the large matrix
in (7), for T large enough to approximate infinity
and ensure convergence of the first-period
solution.  The number of arithmetic operations
required to perform this inversion is in the
order of (DT)³.  Writing T = nd, as before, this
is in the order of (ndD)³ operations.

Assuming n = 4, this number would exceed
k(2dD)³, the number required for the method
suggested, whenever k < 8, which again will almost
certainly be the case.

These are very rough calculations, but they
indicate that the Z-transform method is likely to
be computationally superior to either matrix
iteration /1, p. 78 et seq., 2/ or matrix inversion
/7/.  This is contrary to what Hay and Holt /4/
suggest, but is based on a different method of
calculating the characteristic roots than the
gradient methods to which they allude /op.cit.,
p. 242/.
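The rough operation counts above can be tabulated with a short Python sketch. The function names are hypothetical, the counts are the order-of-magnitude expressions quoted in the text rather than measured costs, and the value D = 10 in the example is an assumption made only for illustration (d = 6 echoes the educational-planning example of Part 1).

```python
def ops_z_transform(d, D, k):
    """k(2dD)^3: eigenvalue location plus inversion for a 2dD x 2dD matrix."""
    return k * (2 * d * D) ** 3

def ops_matrix_iteration(d, D, n=4):
    """2nd(d+1)^3 D^3: nd multiplications of (d+1)D x (d+1)D matrices."""
    return 2 * n * d * (d + 1) ** 3 * D ** 3

def ops_matrix_inversion(d, D, n=4):
    """(ndD)^3: inversion of the DT x DT matrix with T = nd."""
    return (n * d * D) ** 3

# Example: d = 6 periods, D = 10 decision variables (assumed), k = 6, n = 4.
d, D, k = 6, 10, 6
print(ops_z_transform(d, D, k),
      ops_matrix_iteration(d, D),
      ops_matrix_inversion(d, D))

# With n = 4, the iteration count exceeds the Z-transform count when
# (d+1)^3 / d^2 > k; the left side is smallest at d = 2, where it equals 6.75.
iteration_always_larger = all(
    ops_matrix_iteration(dd, 1) > ops_z_transform(dd, 1, 6) for dd in range(1, 200)
)
```

Under these expressions the comparison with matrix iteration depends on d only through (d+1)³/d², which is why a single bound on k suffices, while the comparison with matrix inversion reduces to 64 > 8k, that is, k < 8.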


CONCLUSION 

This  paper  has  sketched  an  algorithm  based  on 
Z-transform  methods,  for  the  numerical  solution  of 
the  optimization  problem  in  which  the  objective 
function  is  quadratic,  and  the  constraints  are 
linear  difference  equations.     It  is  shown  that  the 


method  suggested  is  likely  to  be  more  efficient  than 
either  matrix  inversion  or  matrix  iteration. 

Also  included  in  this  paper  is  a  proof  of  a 
theorem  guaranteeing  the  existence  of  the  requisite 
number  of  roots  of  a  characteristic  polynomial 
satisfying  conditions  for  the  successful  application 
of  the  Z-transform  method  to  the  problem.     A  recent 
proof  by  Hay  and  Holt  /4/  uses  a  different  approach 
and  is  perhaps  not  as  rigorous. 

The  not  too  restrictive  assumptions  on  which 
these  developments  have  been  based  are  that  the  span 
of  cross-temporal  effects  be  finite,  and  the  system 
parameters  be  time-invariant.     The  other  assumption 
that  all  parameters  be  real  needs  to  be  made  in  any 
case.     A  number  of  ancillary  issues  and  extensions 
discussed  in  Hay  and  Holt  /4,  p.  249/  are  not 
discussed  here,  specifically  the  treatment  of 
complex  roots,  zero  roots,  time-varying  parameters, 
and  time-discounting  of  the  objective  function. 
FOOTNOTES 

^        For  a  scalar  sequence  {a^|t  =0,1,2,  ...  )  the 
Z-transform  of  this  sequence  is  defined  as 
Z(a^)  =  z  a^    where  z  is  a  complex  variable. 

The  Z-transform  of  a  vector  sequence  is  defined  as 
the  vector  whose  elements  are  the  Z-transforms  of 
the  individual  elements  of  the  original  vector.  See 
Frazer  et  al  /3/  for  a  discussion  of  the  algebra  of 
Z-transforms. 

°°  t 

More  precisely,  Z(k^)  =    E„  z  k    is  finite 
— t       t=o  — t 

if  the  term  k    grows  at  a  slower  rate  than  the  term 

z  declines. 

3   In /8/ on which this section is based, the word "symmetric" was used. The term
"symmetrical" is used here in a narrowly defined sense, without regard to any other
meaning which might be attached to the word in the mathematics literature.

4   It should be pointed out that the existence of a root $z$ such that $|z| = 1$ is a
very unlikely hairline case, almost impossible to achieve in numerical computations.

5   In the case where (23) were a scalar difference equation, the matrix k would be
known as the adjacency matrix, and Theorem 4 would be a known result. I am indebted
to D. Beckles for pointing this out. The extension to vector difference equations
implicit in Theorem 4, though somewhat straightforward, I haven't been able to find
in the literature.

6   First developed in /8/ (unpublished).

7   It is well known that matrix inversion requires in the order of $n^3$ operations
for a matrix of order n. See /6, p. 69/. The assertion that in the order

of  n"^  operations  are  required  to  locate  the  eigen- 
values of  a  general  matrix  of  order  n,  is  based  on 
an  analysis  of  algorithms  found  in  /9,  contributions 
11/12,  11/14/.     Also,  see  /6,  Ch.  10/. 
THE STRUCTURE AND SOLUTION TECHNIQUES OF
THE  PROJECT  INDEPENDENCE  EVALUATION  SYSTEM 

FREDERIC  H.  MURPHY 
FEDERAL  ENERGY  ADMINISTRATION 


INTRODUCTION 

The  Project  Independence  Evaluation  System  (PIES)  forecasts  the  state  of  the  energy  economy  in  selected 
future  years  (1980,  1985  and  1990)  and  reflects  the  impacts  of  a  range  of  potential  Federal  policies  on 
the  prices  paid  for  energy  commodities,  on  the  level  of  demands  for  these  commodities,  and  on  the  level 
of  imports  of  oil.    The  methodology  used  assumes  that  the  role  of  government  is  to  establish  policies 
allowing  participants  in  the  economy  to  act  in  their  own  self-interest  within  the  constraints  imposed  by 
these  policies.    The  approach  taken  is  to  construct  models  for  the  different  components  of  the  energy 
system and then integrate the submodels or the outputs of the submodels into a forecast. This
modularization allows for the ongoing improvement of the various segments of PIES without having to alter the
entire  system. 

OVERVIEW 

There  is  a  set  of  supply  models  for  each  of  the  major  raw  materials,  coal,  oil  and  gas.    They  are  built 
to  simulate  the  response  of  the  industry  producing  the  raw  material  to  increases  and  decreases  in  prices 
and  are  used  to  construct  supply  curves.    Next,  miniature  models  of  refineries  and  electric  utilities 
transform  raw  materials  into  consumable  forms  of  energy.    Estimates  of  the  production  capabilities  of 
emerging technologies are added as well. The products consumed within the system are six petroleum
products (gasoline, distillate, residual, jet fuel, liquid petroleum gases and other products from crude
oil such as lubes and waxes) and four other products (natural gas, electricity, bituminous coal and
metallurgical coal). A large data base and set of econometric models are used to construct a demand
model  which  estimates  how  the  demand  for  each  final  product  varies  as  the  price  of  that  product  and  the 
prices  of  other  products  change.    As  an  example  of  how  the  price  of  one  product  impacts  the  demand  for 
another,  natural  gas  can  be  replaced  by  distillate  for  many  industrial  uses  on  approximately  a  BTU  for 
BTU  basis  and  vice  versa.    Therefore,  if  the  per  BTU  price  of  one  fuel  gets  out  of  line  with  the  other 
and  both  are  available,  the  lower-cost  fuel  is  substituted  for  the  higher-cost  fuel.    Since  natural  gas 
and  distillate  are  not  perfect  substitutes,  more  fuel  switching  becomes  economic  and  occurs  in  the  model 
as the prices of the two commodities continue to diverge. The demand function is a log-linear
approximation to a set of sector-specific demand models, that is,

$$\ln Q_j^k = a_j^k + \sum_{i=1}^{10} b_{ij}^k \ln p_i^k \quad \text{for } j = 1, \ldots, 10,$$

where $Q_j^k$ is the quantity of product j demanded in sector k when fuels sell at the retail prices
$p_i^k$. The sectors are household, commercial, raw material, industrial and transportation. When i=j,
$b_{ij}^k$ is referred to as the own elasticity for product i and is negative. If $i \neq j$, then
$b_{ij}^k$ is a cross elasticity and in our case is positively signed. With all other prices constant,
an x percent change in $p_i^k$ leads to an $x\,b_{ij}^k$ percent change in the demand for product j. A
full discussion is contained in (2), describing each segment of the demand model in detail.
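
The log-linear form above can be illustrated with a short Python sketch. The coefficients below are invented for a hypothetical two-product sector and are not PIES estimates; the function name `sector_demand` is likewise an assumption made for illustration. The sketch checks the elasticity reading of the coefficients: a 1 percent rise in one price changes each demand by roughly the corresponding coefficient, in percent.

```python
import math

def sector_demand(a, b, prices):
    """Evaluate ln Q_j = a_j + sum_i b[i][j] * ln p_i for every product j."""
    n = len(prices)
    log_p = [math.log(p) for p in prices]
    return [math.exp(a[j] + sum(b[i][j] * log_p[i] for i in range(n)))
            for j in range(n)]

# Hypothetical coefficients for two products (illustrative only).
a = [2.0, 1.5]
b = [[-0.5, 0.2],   # b[0][0]: own elasticity of product 0; b[0][1]: effect of p_0 on Q_1
     [0.1, -0.4]]   # b[1][0]: effect of p_1 on Q_0; b[1][1]: own elasticity of product 1

base = sector_demand(a, b, [1.0, 1.0])
bumped = sector_demand(a, b, [1.01, 1.0])   # raise the first price by 1 percent
pct_change = [100.0 * (q1 / q0 - 1.0) for q0, q1 in zip(base, bumped)]
```

Here `pct_change` comes out close to the first row of `b`, which is the sense in which the coefficients are elasticities; the match is approximate because the percentage rule holds exactly only for small price changes.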

Demands  are  forecasted  for  a  larger  slate  of  products  than  is  available  from  the  supply  structure. 
Also,  the  supply  prices  are  wholesale  as  opposed  to  retail  prices.    To  evaluate  the  demands  for  the  ten 
supplied  products  at  the  supply  (wholesale)  prices,  the  following  steps  are  taken.    First,  each  demand 
product  is  associated  with  one  of  the  supply  products,  e.g.,  petroleum  coke  with  other  petroleum 
products.    Markups  appropriate  for  going  from  wholesale  to  retail  prices  in  the  given  sector  are  then 
added  to  the  wholesale  prices  and  the  demand  equations  for  each  fuel  are  evaluated.    The  resultant 
quantities  are  then  aggregated  across  the  appropriate  demand  products  and  across  sectors  to  determine 
the  demand  for  the  supply  product  at  the  wholesale  price. 
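
The three steps just described (map each demand product to a supply product, add sector markups, aggregate) can be sketched in Python as follows. The data layout, the function names and the toy demand rule are assumptions chosen for illustration; the source does not specify how the markups or the product mapping are stored.

```python
def wholesale_demand(wholesale_prices, markups, demand_fn, product_map):
    """Aggregate retail-level demands to supply products at wholesale prices.

    wholesale_prices: {supply_product: wholesale price}
    markups: {sector: {demand_product: markup added to the wholesale price}}
    demand_fn(sector, retail_prices) -> {demand_product: quantity}
    product_map: {demand_product: associated supply_product}
    """
    totals = {s: 0.0 for s in wholesale_prices}
    for sector, sector_markups in markups.items():
        # Step 2: wholesale price of the associated supply product plus the sector markup.
        retail = {d: wholesale_prices[product_map[d]] + m
                  for d, m in sector_markups.items()}
        # Step 3: evaluate the sector's demands and add them to the supply product.
        for d, q in demand_fn(sector, retail).items():
            totals[product_map[d]] += q
    return totals

def toy_demand(sector, retail_prices):
    """Hypothetical stand-in for the sector demand equations."""
    return {d: 100.0 / p for d, p in retail_prices.items()}

product_map = {"petroleum coke": "other petroleum products",
               "lubes": "other petroleum products"}
markups = {"industrial": {"petroleum coke": 1.0, "lubes": 6.0},
           "household": {"lubes": 16.0}}
totals = wholesale_demand({"other petroleum products": 4.0},
                          markups, toy_demand, product_map)
```

With these made-up numbers the three retail prices are 5, 10 and 20, so the aggregated demand for the supply product is 20 + 10 + 5.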

The  various  components  of  the  system  are  tied  together  by  a  transportation  network  that  moves  raw 
materials  or  products  from  where  they  are  produced  to  where  they  are  consumed  or  where  they  are  used  to 
produce  other  energy  products.    The  flows  within  the  system  are  shown  in  Figure  1. 

FIGURE 1

Flow of Materials

[Diagram with three columns: Supply, Conversion, Demand. Supply sources shown: shale regions,
oil regions, gas regions, coal regions, imports.]
The supply functions, the demand function, and the conversion activities are combined by the PIES
integrating system. The output of the integrating system is a general equilibrium solution (Figure 2)
of the mathematical representation of the energy economy, i.e., a set of balanced supplies and demands
as well as market clearing prices for each fuel. Formally stated, the problem is to find a vector of
prices p such that D(p) = S(p), where D(p) is the vector of demands and S(p) the vector of supplies at p.

FIGURE  2 
REGIONAL STRUCTURE

Each  raw  material  has  a  regional  structure  in  the  model  that  represents  the  unique  characteristics  of 
its  resource  base.    There  are  also  specific  regional  definitions  for  conversion  and  demand  activities. 
The  purpose  of  all  of  the  regional  detail  within  the  model  is  not  to  provide  results  for  regional 
analyses  but  to  provide  better  national  figures.    Western  low  sulfur  coal  would  implicitly  be  used  in 
New  England,  for  example,  if  transportation  cost  differentials  were  ignored.    Also,  more  nuclear  plants 
would  be  built  if  a  national  average  transportation  cost  were  used  for  coal. 


There are twelve coal regions, which are chosen so that each is relatively compact and contains only a
few categories of bituminous coal. Some regions also contain metallurgical coal. Coal is
separated  into  five  BTU  categories,  and  each  of  the  bituminous  categories  is  divided  into  three  sulfur 
types:    high,  medium  and  low  sulfur.    The  coal  regions  are  shown  in  Figure  3.    Transportation  is  a 
substantial  part  of  the  costs  in  using  coal.    Within  the  model,  the  more  compact  the  coal  region,  the 
better  the  estimate  of  transportation  costs  from  the  coal  region  to  the  utility  or  demand  regions. 
As a consequence, even though some regions such as Central and Southern Appalachia contain the same
categories of coal, they are modeled as two separate regions to give a better estimate of transportation
costs. To further refine the costs of shipping, a transshipment network is used. Coal moves from the
coal region to a collection of transshipment nodes: cities such as Cincinnati, New Orleans and Atlanta.
From the transshipment points the coal is not shipped to the demand region as a whole; it is shipped to a
selected set of cities in the demand region, with the demand in each city a fixed fraction of the demand
region's needs.

FIGURE  3 

PIES  Coal  Supply  Regions 


There are 12 oil and 13 gas regions based on National Petroleum Council (NPC) regions and special
Alaskan regions. The refinery regions are the Petroleum Administration for Defense Districts, or PADDs.
Crude oil (together with condensates, etc., from gas regions) is moved into the refinery regions from the
oil and gas regions by pipeline or tanker. The six product groups are moved from the refinery regions to
either the utility or demand regions.

For  ease  in  modeling,  the  utility  and  demand  regions  are  the  same.    They  are  FEA  regions.    Unlike  the 
regions  for  supplying  other  forms  of  energy,  a  utility  region  may  serve  only  the  corresponding  demand 
region  except  for  the  shipment  of    hydroelectric  power  from  the  Northwest  to  California.    This  greatly 
simplifies  the  calculating  of  the  average  cost  of  electricity,  that  is,  the  price  of  electricity  to 
consumers. 
SUPPLY ACTIVITIES

The traditional approach used by economists is to estimate output as a function of capital and labor
without serious regard to the resource base (see, for example, [2]). This is inappropriate here because
the  most  important  factor  affecting  the  supply  of  fuels  is  the  character  and  extent  of  the  resource  base. 
Rather  than  using  historical  time  series  data  and  statistical  techniques  to  directly  predict  future  raw 
material  and  product  availability,  operations  research-based  process  models  are  built  to  simulate  the 
actual  production  capabilities  given  the  resources  of  an  energy  sector. 

The supply models are used to construct supply curves that are step-function approximations to continuous
functions. For example, each step of the coal supply curve for each region represents the annual rate
of  production  from  a  specific  mine  type  within  two  mine  classes,  surface  and  deep.    In  Figure  4  the 
lowest-cost  steps  on  the  coal  supply  curves  are  associated  with  existing  mines  or  mines  that  are  about 
to  be  opened.    Here  the  capital  is  sunk  or  mostly  sunk  and  the  mines  will  be  operated  as  long  as  the 
marginal  revenue  is  at  least  equal  to  the  operating  cost.    The  higher-cost  steps  ensure  the  capital 
recovery  necessary  for  opening  a  new  mine.    There  are  supply  curves  for  each  oil  and  gas  region  that 
distinguish  primary,  secondary  and  tertiary  production.    Different  crude  types  such  as  West  Coast  heavy 
and  Wyoming  Mix  are  distinguished  by  region.    Each  crude  type  is  produced  in  proportion  to  its  historic 
share  for  the  region. 
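
A step-function supply curve of this kind can be represented as a list of (cost, capacity) pairs. The Python sketch below is only an illustration: the step costs and capacities are invented, and the helper names are assumptions rather than anything taken from the PIES supply models.

```python
def supply_at_price(steps, price):
    """Quantity offered at a given price: every step whose cost is at or below the price."""
    return sum(capacity for cost, capacity in steps if cost <= price)

def marginal_cost(steps, quantity):
    """Cost of the step that supplies the last unit of `quantity`."""
    produced = 0.0
    for cost, capacity in sorted(steps):
        produced += capacity
        if quantity <= produced:
            return cost
    raise ValueError("quantity exceeds total capacity of the supply curve")

# Hypothetical regional coal curve: (cost per ton, annual capacity).
# The cheapest step stands for existing mines, the others for new mines.
coal_steps = [(8.0, 30.0), (12.0, 20.0), (20.0, 25.0)]

quantity_at_12 = supply_at_price(coal_steps, 12.0)
cost_of_40th_unit = marginal_cost(coal_steps, 40.0)
```

At a price of 12 the first two steps produce, and the fortieth unit comes from the second step, so its marginal cost is 12.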

FIGURE  4 
CONVERSION ACTIVITIES

The  electric  utility  and  refinery  sectors  of  the  PIES  system  are  embedded  directly  into  the  general 
equilibrium  model.    In  this  respect,  the  conversion  models  are  different  from  the  supply  models  for  coal, 
oil  and  gas  where  only  the  outputs  (supply  curves)  are  directly  included  in  the  general  equilibrium  model. 


The  key  to  modeling  electric  utilities  is  that  they  cannot  inventory  their  product  and  must  produce 
electricity  on  demand.    This  means  that  utilities  must  own  some  equipment  that  runs  most  of  the  time  and 
some  equipment  that  runs  only  during  peak  demand  periods.    The  demand  levels  for  electricity  during  a 
year  are  represented  by  the  load  duration  curve.    A  point  (x,  y)  on  the  load  duration  curve  in  Figure  5 
shows  that  for  x  hours  during  the  year  at  least  y  kilowatts  were  demanded.    This  curve,  for  modeling 
purposes,  is  divided  into  three  pieces:    base,  intermediate,  and  peak.    The  kinds  of  generation  equipment 
that can be used include nuclear, hydroelectric, coal-fired (with and without scrubbers), residual-fired,
natural gas-fired, and simple-cycle and combined-cycle distillate plants as well as new technologies.
Any  of  these  types  of  equipment,  other  than  nuclear,  can  operate  in  base,  intermediate,  or  peak,  whichever 
is  most  economic  for  the  utility.    Nuclear  operates  in  base  only. 
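
As a rough illustration of the load duration curve, the Python sketch below builds one from a handful of hourly loads and splits the energy into base, intermediate and peak blocks. The loads and the two threshold levels are invented, and slicing the curve horizontally by load level is an assumption; the source does not say exactly how the three pieces are cut.

```python
def load_duration_curve(hourly_loads):
    """Sort loads in descending order; a point (x, y) means the load was at least y for x hours."""
    ordered = sorted(hourly_loads, reverse=True)
    return [(hours, load) for hours, load in enumerate(ordered, start=1)]

def split_energy(hourly_loads, base_level, intermediate_level):
    """Energy (load-hours) falling in the base, intermediate and peak slices."""
    band = intermediate_level - base_level
    base = sum(min(load, base_level) for load in hourly_loads)
    inter = sum(min(max(load - base_level, 0.0), band) for load in hourly_loads)
    peak = sum(max(load - intermediate_level, 0.0) for load in hourly_loads)
    return base, inter, peak

# Hypothetical loads in kilowatts for four hours.
loads = [50.0, 80.0, 120.0, 60.0]
curve = load_duration_curve(loads)
base, inter, peak = split_energy(loads, base_level=60.0, intermediate_level=100.0)
```

The three slices add up to the total energy demanded, which is a quick sanity check on any choice of thresholds.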


FIGURE  5 
The major role of the refinery sector is to reflect the appropriate relative prices for different crude
types based on crude oil attributes and the appropriate relative prices for the four product classes
based on product characteristics. The quantities of the four products consumed and the quantities of the
different types of crude oil should interact in the refinery sector according to relative demands and
availabilities. Crude oils with greater yields of the more heavily demanded products should be more
valuable, and the products that dominate demand, such as gasoline, should be more expensive.

A primary product yield from each crude oil is estimated using historical data. Next, by parametrically
varying  the  yields  of  a  large-scale  refinery  linear  program,  the  costs  of  shifting  the  product  slate  are 
determined.    The  result  is  a  model  that  is  an  extreme  point  representation  of  a  refinery. 

THE  INTEGRATING  MECHANISM 

The  solution  procedure  involves  inserting  a  step-function  approximation  to  the  demand  function  into  a 
linear  program  containing  the  supply  curves  and  the  models  of  conversion  activities.    The  approximation 
to  the  demand  function  ignores  the  effects  of  the  price  of  one  product  on  the  demands  for  other  products, 
e.g.,  only  the  natural  gas  price  affects  natural  gas  demand.    The  linear  program  is  then  solved  with  the 
objective  of  maximizing  the  area  under  the  difference  between  the  demand  and  supply  curves.    This  is 
mathematically  equivalent  to  finding  where  the  supply  and  demand  curves  intersect  (Figure  6).    How  close 
the  prices  and  quantities  are  to  being  on  the  demand  function  containing  the  cross  price  effects  is  then 
measured. If the linear programming quantities are not within one percent of the demand function
quantities evaluated at the prices taken from the linear program, the equilibration process continues with a
new demand function approximation. The demand function containing the cross price effects is
evaluated  at  an  average  between  the  prices  from  the  linear  program  and  the  previous  estimate  of  the 
equilibrium  prices;  and  a  new  approximation  to  the  demand  curve  is  constructed  around  this  point. 
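
The iteration described above can be sketched on a toy problem. The Python below is not the PIES integrating system: it assumes two products, invented log-linear demand coefficients, and simple linear supply curves, and it replaces the linear program with a one-dimensional bisection per market (possible here only because the toy has no conversion activities linking the markets). All names are hypothetical. What it does keep is the logic of the text: an own-price demand approximation built around the current price estimate, a one percent test against the full demand function, and averaging of prices before the next approximation.

```python
import math

def full_demand(a, b, prices):
    """Demand function with cross-price effects: ln Q_j = a_j + sum_i b[i][j] ln p_i."""
    n = len(prices)
    log_p = [math.log(p) for p in prices]
    return [math.exp(a[j] + sum(b[i][j] * log_p[i] for i in range(n)))
            for j in range(n)]

def clear_market(demand, supply, lo=1e-9, hi=1e9):
    """Bisection for the price where a decreasing demand meets an increasing supply."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if demand(mid) > supply(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def equilibrate(a, b, supply_slopes, p_est, tol=0.01, max_iter=50):
    n = len(p_est)
    for iteration in range(1, max_iter + 1):
        q_est = full_demand(a, b, p_est)          # demand at the current estimate
        p_new, q_new = [], []
        for j in range(n):
            # Own-price approximation through (p_est[j], q_est[j]); cross effects ignored.
            own = lambda p, j=j: q_est[j] * (p / p_est[j]) ** b[j][j]
            sup = lambda p, j=j: supply_slopes[j] * p   # assumed linear supply curve
            price = clear_market(own, sup)
            p_new.append(price)
            q_new.append(sup(price))
        # Compare with the full demand function evaluated at the new prices.
        q_check = full_demand(a, b, p_new)
        if all(abs(q_new[j] - q_check[j]) <= tol * q_check[j] for j in range(n)):
            return p_new, q_new, iteration
        # Build the next approximation around the average of new and previous prices.
        p_est = [0.5 * (p_new[j] + p_est[j]) for j in range(n)]
    raise RuntimeError("no convergence within max_iter iterations")

a = [math.log(100.0), math.log(80.0)]
b = [[-0.5, 0.2], [0.1, -0.4]]
slopes = [20.0, 10.0]
prices, quantities, iterations = equilibrate(a, b, slopes, [1.0, 1.0])
```

The returned quantities lie on the supply curves by construction and within one percent of the full demand function at the returned prices, which is the stopping rule stated in the text.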

FIGURE  6 
In the past, people have tried to find economic equilibria by just inserting a single demand point and
successively  replacing  the  demand  quantities  with  new  values  from  the  demand  model  evaluated  at  the  dual 
variables.    The  dual  variables  seemed  to  oscillate  and  not  converge.    By  inserting  a  demand  curve 
approximation,  an  equilibrium  is  achieved  after  6  to  8  iterations  involving  the  solution  of  linear 
programs.    Currently,  we  vary  all  prices  the  same  percentage  simultaneously  in  calculating  the  step 
widths.    If  the  convergence  were  slower,  we  would  try  varying  all  prices  with  percentages  proportional 
to  a  trajectory  of  prices  from  successive  iterations.    Other  procedures  have  been  considered  but  not  tested. 
THE RELATION OF PIES TO GENERAL EQUILIBRIUM THEORY

The Neoclassical model of exchange may be described as follows. Each consuming unit $i$, for
$i = 1, \ldots, m$, has an initial endowment of assets:

$$W^i = (w_1^i, \ldots, w_n^i).$$

At a price vector $\pi = (\pi_1, \ldots, \pi_n)$, consuming unit $i$ demands a vector of products
$d^i(\pi)$ and has an income of $P^i = \sum_{j=1}^{n} \pi_j w_j^i$. For convenience,
$\sum_{j=1}^{n} \pi_j = 1$ is usually added as a requirement, i.e., the analysis is in terms of constant
dollars. Since in equilibrium an individual cannot spend more than the revenue from the sale of his assets,

$$\pi d^i(\pi) = \pi W^i.$$

This leads to Walras' law:

$$\pi \sum_{i=1}^{m} d^i(\pi) = \pi \sum_{i=1}^{m} W^i.$$

That is, the monetary value of what is demanded equals that of what is supplied. Letting

$$g_j(\pi) = \sum_{i=1}^{m} \left( d_j^i(\pi) - w_j^i \right)$$

be the excess demand function for asset $j$, for $j = 1, \ldots, n$, an economy is in equilibrium when

$$g_j(\pi) \le 0, \qquad \pi_j \, g_j(\pi) = 0.$$

That is, either supply equals demand or supply exceeds demand and the price of the asset is zero. When
the pure exchange economy is generalized to include activities for the conversion of one asset into another
using linear activities, we have the following definition of a competitive equilibrium:

Definition - A price vector $\pi^*$ and a vector of activity levels $y^*$ constitute a
competitive equilibrium if:

a.  Supply equals demand in all markets, or $d(\pi^*) = By^* + W$,
where B is the matrix of possible activities; and

b.  production is consistent with profit maximization in the
sense that $\sum_i \pi_i^* b_{ij} \le 0$ with equality if $y_j^* > 0$.
Part  b  is  a  requirement  that  excess  profits  are  associated  only  with  rents  on  scarce  resources. 
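
The two conditions of the definition can be checked numerically. The Python sketch below does so for a made-up economy with two assets and one conversion activity; the demand rule, the numbers and the function names are assumptions for illustration only. In the code, j indexes assets and k indexes activities.

```python
def is_competitive_equilibrium(pi, y, B, W, demand, tol=1e-9):
    """Check conditions (a) and (b); B[j][k] is the net output of asset j per unit of activity k."""
    n, n_act = len(pi), len(y)
    d = demand(pi)
    # (a) supply equals demand in every market: d(pi) = B y + W
    for j in range(n):
        supply_j = W[j] + sum(B[j][k] * y[k] for k in range(n_act))
        if abs(d[j] - supply_j) > tol:
            return False
    # (b) no activity earns a positive profit, and every activity in use breaks even
    for k in range(n_act):
        profit = sum(pi[j] * B[j][k] for j in range(n))
        if profit > tol or (y[k] > tol and abs(profit) > tol):
            return False
    return True

W = [10.0, 0.0]            # total endowment: ten units of asset 0, none of asset 1
B = [[-1.0], [2.0]]        # one activity: 1 unit of asset 0 becomes 2 units of asset 1

def demand(pi):
    """Hypothetical demand: 60 percent of income spent on asset 0, 40 percent on asset 1."""
    income = sum(p * w for p, w in zip(pi, W))
    return [0.6 * income / pi[0], 0.4 * income / pi[1]]

at_equilibrium = is_competitive_equilibrium([2.0 / 3.0, 1.0 / 3.0], [4.0], B, W, demand)
off_equilibrium = is_competitive_equilibrium([0.5, 0.5], [4.0], B, W, demand)
```

At prices (2/3, 1/3) the activity just breaks even and both markets clear with four units of the activity in use; at prices (1/2, 1/2) the activity would earn a positive profit, so the second check fails.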

In PIES the assets are raw materials such as coal, oil and biomass as well as electrical generation
equipment, refineries, pipelines, etc. Added to these is an aggregate asset which represents capital for
new equipment and for developing new resources, labor and other nonenergy resources. The components of
this aggregate asset are priced in terms of 1975 dollars, giving us the objective function cost coefficients
in the linear programming subproblem. The number of assets in PIES is large because each step on each
regional supply curve of a fuel is a different asset in the exchange economy with a different equilibrium
price (dual variable on the bound row). Every bounded variable is an asset, as is every product in every
demand region. In PIES an economic equilibrium is found where all assets are priced relative to the
aggregate asset. The prices may then be normalized to achieve $\sum_{j=1}^{n} \pi_j = 1$. This leads to an
alternative interpretation of the PIES mechanism for searching for an equilibrium. The goal at each
iteration is to satisfy Walras' law with the dual variables as prices and a successively improved
approximation to the demand model.


NONCOMPETITIVE  PHENOMENA  MODELED  IN  PIES 

There  are  three  areas  where  regulatory  actions  that  alter  the  structure  of  the  economy  are  modeled. 
These  are  the  average  cost  pricing  of  electricity,  current  oil  import  entitlements  and  interstate 
regulation  of  natural  gas. 

UTILITY  REGULATION 

Since  there  are  increasing  returns  to  scale  to  power  transmission  and  distribution,  the  electric  utilities 
constitute  a  natural  monopoly  which  must  be  regulated  in  some  fashion.    Currently,  public  utility  commissions 
regulate  utilities  to  provide  them  with  a  reasonable  rate  of  return  on  their  total  investment.    This  means 
that  customers  are  charged  the  average,  not  marginal,  cost  of  delivering  electricity.    The  cost  curves  for 
delivery  look  as  follows: 

FIGURE  7 
Given a set of fuel prices, the marginal cost curve is a step function where each step represents bringing
into  use  a  different  kind  of  generation  equipment.    The  higher  costs  are  due  to  using  less  efficient 
equipment  or  including  the  capital  costs  for  acquiring  new  equipment.    For  low  quantities  of  electricity 
the  average  cost  curve  is  higher  than  the  marginal  cost  curve.    This  is  because  the  cost  of  the  existing 
equipment  is  included  in  the  average  cost  of  electricity. 

The  prices  from  a  linear  program  are  marginal  prices.    Therefore,  for  the  demand  model  to  respond  to 
average  instead  of  marginal  prices  an  adjustment  must  be  made  to  the  linear  program.    The  approach  taken  is 
to  approximate  the  average  cost  curve  with  the  marginal  cost  curve.    This  is  done  in  the  following  manner. 
Say the amount of electricity demanded in a given region at the end of a linear programming step is B. The
difference between the marginal and average costs of electricity is C-D. In revising the linear program
for the next iteration the transmission cost is changed so that the marginal cost is approximately equal to
the average cost at B. Letting a be the adjustment to the true transmission cost in the linear program
from the previous iteration, the new adjustment is (C-D + a)/2. The average is taken for smoothing purposes.
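
The smoothing rule can be written as a one-line update. The Python sketch below uses invented values for C and D and, as a simplifying assumption, holds the gap C-D fixed from one iteration to the next; in PIES the gap is re-read after every linear program. The function name is hypothetical.

```python
def next_adjustment(c, d, a_prev):
    """New transmission-cost adjustment: the average of the gap (C - D) and the previous adjustment."""
    return 0.5 * ((c - d) + a_prev)

# Hypothetical gap C - D = 4.0, starting from no adjustment.
a = 0.0
history = []
for _ in range(20):
    a = next_adjustment(6.0, 2.0, a)
    history.append(a)
```

With a constant gap the adjustment moves halfway toward C-D at every step (2.0, 3.0, 3.5, and so on), which shows why the averaging damps changes rather than applying the full correction at once.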

This average cost pricing mechanism can have a substantial effect on convergence. First, if the quantity
of electricity is less than A, then a decreasing average cost curve is approximated by a nondecreasing
function. The result is a perverse behavior where the amount of electricity generated decreases at each
iteration while the cost increases. We have also experienced another form of non-convergent behavior because
of our implementation. We have seen the quantity of electricity oscillate between two steps on the marginal
cost curve, so that the marginal-to-average cost adjustment fluctuates instead of stabilizing, which in turn
leads to an oscillation in prices and quantities. This occurred in the Northwest, where there was a big jump
in marginal cost in going from hydro power to fossil and/or nuclear power as the marginal source of
electricity. We do not have a complete explanation of this phenomenon. Our best guess is that it has to do
with the behavior, near the minimum of the average cost curve, of our implementation of this approach to
average cost pricing.

OIL  ENTITLEMENTS 

The  current  regulations  on  oil  production  require  that  the  average  price  of  domestically  produced  oil  be 
below  $8.00.    Since  the  marginal  price  of  oil  in  the  United  States  is  the  world  price  because  of  our  import 
dependence,  the  marginal  price  of  oil  products  would  be  refining  costs  plus  crude  oil  costs  at  the  world 
price  if  there  were  no  other  provisions  in  the  law.    The  regulations  specify  that  refiners  who  use  domestic 
oil  pay  $X  per  barrel  into  a  fund  from  which  refiners  who  use  imported  oil  receive  $Y  per  barrel.  This 
entitlement  to  the  users  of  imported  oil  makes  them  indifferent  between  domestic  and  imported  oils,  that  is, 


$$P_d + X = P_i - Y,$$

where $P_d$ is the average domestic oil price and $P_i$ is the import price. It is also important that the
fund never run deficits or surpluses, that is,

$$X Q_d = Y Q_i,$$

where $Q_d$ is the domestic production and $Q_i$ is imports. There also are entitlements to different types of
domestic  oil.    There  is  a  legal  definition  of  what  is  called  "old"  oil  and  "new"  oil  with  "old"  oil  having 
a  substantially  different  price  from  "new"  oil. 

Entitlements  are  modeled  as  follows.    The  total  domestic  oil  production  given  the  regulations  is  estimated. 
The  supply  curves  for  oil  in  the  linear  programming  matrix  are  replaced  by  this  supply  point  and  the  cost 
in  the  objective  function  is  the  average  price  for  this  oil.    New  activities  are  added  to  the  matrix  that 
essentially  tax  domestic  oil  an  amount  X  and  give  a  credit  of  Y  to  imported  oil.    The  X  and  Y  are 
recalculated between L.P. iterations using the above equations. In equilibrium we therefore have a
scheme in which these equations are satisfied, together with a forecast of the impact of this form of regulation.
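
The two entitlement equations, $P_d + X = P_i - Y$ and $X Q_d = Y Q_i$, are linear in X and Y and can be solved in closed form: $X = (P_i - P_d) Q_i / (Q_d + Q_i)$ and $Y = (P_i - P_d) Q_d / (Q_d + Q_i)$. The Python sketch below applies these formulas to invented prices and quantities; the function name and the numbers are assumptions, not PIES data.

```python
def entitlement_rates(p_domestic, p_import, q_domestic, q_import):
    """Solve P_d + X = P_i - Y and X * Q_d = Y * Q_i for the fee X and the credit Y."""
    total = q_domestic + q_import
    gap = p_import - p_domestic
    x = gap * q_import / total      # paid per barrel of domestic oil
    y = gap * q_domestic / total    # received per barrel of imported oil
    return x, y

# Hypothetical values: dollars per barrel and barrels per day (in millions).
p_d, p_i, q_d, q_i = 8.0, 13.0, 10.0, 6.0
x, y = entitlement_rates(p_d, p_i, q_d, q_i)
effective_price = p_d + x
```

Under these equations the common effective price works out to the quantity-weighted average of the domestic and import prices, which is one way to see why refiners become indifferent between the two sources.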

NATURAL  GAS  REGULATION 

Currently,  there  are  two  markets  for  natural  gas:    the  unregulated  intrastate  markets  and  the  Federally 
regulated sales of natural gas across state boundaries. By law the interstate gas price is based on
historical costs; this has led to a two-tier market in which intrastate gas sells at a
higher price than interstate gas. Consequently, the only new sales of interstate gas are coming from
onshore  areas  where  fields  are  close  to  interstate  pipelines  and  there  are  no  intrastate  pipelines  nearby 
or  from  the  outer-continental  shelf  which  is  under  Federal  jurisdiction.    The  supply  of  interstate  natural 
gas  in  a  given  year  can  be  estimated  by  taking  the  current  rate  of  production  and  reducing  it  by  the 
natural  decline  from  existing  wells  in  the  onshore  regions  and  adding  to  this  the  production  from  the 
outer-continental  shelf.    Since  the  customer  must  be  charged  the  average  cost  of  interstate  gas  (there  are 
three  price  levels  for  various  vintages  of  domestically  produced  interstate  gas  plus  the  costs  of  liquefied 
natural  gas  (LNG)  and  imports  from  Canada),  there  is  a  shortfall  in  supply  in  regions  where  there  is  little 
or  no  intrastate  gas  available. 

To deal with the shortfall a priority scheme has been developed by the FPC to allocate gas to states by
classes for a given pipeline. Each state then allocates the gas available from the pipeline to customers
within the state based on its own priority structure.

The  modeling  approach  taken  is  to  assume  that  gas  customers  fall  into  two  distinct  groups:    those  with 
interstate  gas  and  those  that  must  use  intrastate  gas  or  a  bundle  of  other  fuels  with  the  bundle  containing 
such  fuels  as  electricity  and  distillate.    To  determine  the  customers  who  receive  interstate  gas,  the 
available  domestically  produced  gas  is  allocated  in  an  approximation  to  the  FPC  priority  scheme  across 
the  nation  rather  than  by  individual  pipeline.    The  priorities,  in  order,  are  residential,  commercial,  raw 
material  and  industrial.    After  a  region  receives  its  share  of  domestically  produced  gas,  imported  gas 
available  to  the  region  is  then  rolled  in  until  either  all  demand  is  satisfied  or  there  is  no  more  gas 
available.    Excess  demand  in  curtailed  sectors  is  satisfied  first  by  intrastate  supply  to  the  extent  it  is 
available  at  creditable  prices  and  then  by  a  displacement  or  switching  of  this  demand  to  other  fuels  in  a 
way  that  is  sensitive  to  other  fuel  prices,  sector-specific  end  use  efficiencies  and  historical  shares. 
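
The priority ordering can be illustrated with a small allocation routine. This Python sketch is a simplification made for illustration: it treats the available gas as a single pool, whereas the text allocates domestically produced gas first and then rolls in imported gas region by region. The quantities and the function name are invented.

```python
PRIORITIES = ["residential", "commercial", "raw material", "industrial"]

def allocate_gas(available, demands):
    """Serve sectors in priority order; return the fraction of each sector's demand that is met."""
    fractions = {}
    for sector in PRIORITIES:
        wanted = demands.get(sector, 0.0)
        served = min(wanted, available)
        # A sector with no demand is reported as fully satisfied.
        fractions[sector] = served / wanted if wanted > 0 else 1.0
        available -= served
    return fractions

# Hypothetical case: 70 units of gas, demands only in the first two sectors.
fractions = allocate_gas(70.0, {"residential": 40.0, "commercial": 50.0})
```

Here residential demand is met in full and the remaining 30 units cover 60 percent of commercial demand; the unmet remainder is the demand that would turn to intrastate gas or the substitute fuel bundle.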

Say for illustration residential demand is satisfied but only a fraction $\alpha$ of commercial demand is
satisfied and there is no raw material or industrial demand to be satisfied.
The total supply to all commercial customers is shown in Figure 8.

FIGURE 8

[Diagram labels: interstate gas; intrastate gas; substitute bundle and intrastate gas.]

The demand for natural gas is the sum of the demand satisfied by interstate gas at the interstate gas
price plus the demand for intrastate gas and the substitute fuel bundle. Ignoring cross elasticities, the
demand for gas is $Q = aP^e$, where $P$ is the natural gas price, $e$ is the elasticity of natural gas
demand and $a$ is a constant. At the interstate price $P_R$, the quantity of gas demand met, $Q_R$, is
$\alpha a P_R^e$, where $\alpha$ is the fraction of demand satisfied by interstate gas. This means that
the demand curve for intrastate gas and the substitute bundle is $(1-\alpha) a P^e$. In Figure 9 the
demand curve for gas is D'. Because the inter- and intrastate markets are separate, the total demand for
natural gas is represented by the demand curve D where $Q = Q_R + (1-\alpha) a P^e$. That is, the quantity
of intrastate gas and the substitute fuel bundle demanded at the equilibrium price is $E - Q_R$. The
quantity of intrastate gas consumed is $A - Q_R$ and the quantity of demand met by the fuel bundle is
$E - A$. The price of the gas displacement bundle is a sector-specific market price constructed from its
fuel components. As the diagram shows, this activity is assumed to dominate the further movement up the
interstate supply curve. The assumption which justifies this is that the process model of fuel
substitution on the demand side is more reliable than the econometric model at very high prices.

FIGURE 9

[Horizontal axis: quantity Q, with marked points $Q_R$, A and E. Curves shown: demand curves D and D'.]
Note that the total quantity of gas demanded with this demand curve is greater than the quantity demanded
with  the  original  demand  curve  D'.    This  is  because  consumers  of  interstate  gas  see  the  low  interstate 
gas  price  and  not  the  higher  intrastate  price. 
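
The comparison between the two demand curves can be checked numerically. The Python sketch below evaluates the original curve $Q = aP^e$ and the two-tier curve $Q = \alpha a P_R^e + (1-\alpha) a P^e$ with invented parameter values; the function names and numbers are assumptions for illustration, and the elasticity is taken to be negative.

```python
def original_demand(price, a, e):
    """Single-market demand curve D': Q = a * P**e."""
    return a * price ** e

def two_tier_demand(price, p_regulated, alpha, a, e):
    """Demand curve D: interstate customers see P_R, everyone else sees the market price P."""
    q_interstate = alpha * a * p_regulated ** e
    q_other = (1.0 - alpha) * a * price ** e
    return q_interstate + q_other

# Hypothetical parameters: 60 percent of demand served at the regulated price.
a, e, alpha, p_r = 100.0, -0.5, 0.6, 1.0

d_prime = original_demand(4.0, a, e)                # 100 * 4**-0.5
d_total = two_tier_demand(4.0, p_r, alpha, a, e)    # 60 + 40 * 4**-0.5
same_at_regulated_price = two_tier_demand(p_r, p_r, alpha, a, e)
```

Whenever the market price is above the regulated price and the elasticity is negative, the two-tier quantity exceeds the original one, and the two curves coincide when the market price equals the regulated price.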

COMPARISONS  OF  PIES  TO  OTHER  PROCEDURES  FOR  FINDING  A  COMPETITIVE  EQUILIBRIUM 

The  only  other  technique  for  finding  a  competitive  equilibrium  with  computational  results  involves 
discrete  approximations  of  the  simplex  of  prices  and  a  search  for  a  fixed  point.    The  fixed  point  approach 
of Scarf [3] with enhancements by Hansen [3] has results that cannot be seriously compared with PIES
because  the  problems  solved  are  on  a  much  smaller  scale.    Appendix  2  in  [3]  gives  computational  experience 
for  Scarf's  algorithm.    He  estimates  that  computation  time  varies  as  m^.    On  an  IBM  360-50  he  estimates 
the  time  to  solve  a  15  asset  problem  to  be  about  2  minutes.    Extrapolating,  a  100  asset  problem  would 
then  take  more  than  2,000  minutes.    The  solution  step  in  PIES  with  100  products  consumed,  not  counting 
intermediate  products  or  raw  materials  which  are  in  the  thousands,  takes  from  20  to  30  minutes  on  a 
370-168  under  MVS  with  the  number  of  linear  programs  to  be  solved  ranging  from  6  to  8.    When  natural  gas 
regulation  is  in  the  model  the  number  of  iterations  increases  to  15-20.    The  reason  for  the  increase  in 
iterations  is  that  supply  of  intrastate  natural  gas  is  inelastic  in  many  regions  leading  to  large  price 
changes  for  little  quantity  change. 
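
The extrapolation quoted above for Scarf's algorithm can be examined with a short calculation. The sketch below assumes only that computation time follows a power law in the problem size; the function names are illustrative and the exponent is left as a parameter.

```python
import math


def implied_exponent(size_small, time_small, size_large, time_large):
    """Exponent k such that time is proportional to size**k through both points."""
    return math.log(time_large / time_small) / math.log(size_large / size_small)


def extrapolate_time(size_small, time_small, size_large, exponent):
    """Scale a measured time to a larger problem under a power law."""
    return time_small * (size_large / size_small) ** exponent


# Figures quoted in the text: 15 assets in about 2 minutes,
# 100 assets in more than 2,000 minutes.
k = implied_exponent(15, 2.0, 100, 2000.0)
print(round(k, 2))
print(extrapolate_time(15, 2.0, 100, 4))
```

Under this assumption the two quoted figures imply an exponent of at least about 3.6, since a cubic law would give under 600 minutes for the 100 asset problem.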
Abstract

A  set  of  large-scale  programming  models  has  been  developed  and  quantitatively  applied  to 
analyze  land  use,  water  use,   environmental  restraints  and  impacts,  agricultural  and  food 
policies,  export  problems,   income  redistribution  patterns  and  other  facets  of  the  nation's 
agricultural  sector.     Perhaps  these  are  the  largest  operationally  useful  mathematical 
programming  models  in  existence.     The  models  incorporate  nine  land  class  restraints  in  each 
of 223 producing regions. They include 51 water resource regions with corresponding
restraints and 35 market regions with demand relations for all relevant endogenous commodities.
The models generate results at national, state, regional or local levels. Their results
have been used by all major national commissions on food policy, water allocation, and land
use.


Many applications of mathematical programming models have now been
made. We have completed, or have underway, a set of models which is
unusual in its size, scope and national policy uses. While we employ
both linear and quadratic specifications, main reliance is placed on
linear models. These national and interregional models applied to
agriculture and natural resources have been under evolution for a
dozen years. This time period has allowed us to build up a vast data
bank and add many dimensions to the models. At various stages in their
development, the models have been the main quantitative base for
various commissions dealing with national policy. They have served
this role for the President's Food and Fiber Commission [3], the
National Water Commission [4], the Water Resources Council's National
Water Assessment [7], and the Midwest Governors Conference on Land Use
[5], and for numerous applied studies for the Environmental Protection
Agency [6], the National Water Quality Commission [9], and the Soil
Conservation Service [10].

In this paper, we report on one set of models capable of evaluating
water and land use and their impacts on the environment at both
national and regional levels [8]. While we apply one such model to
evaluate potentials in environmental improvement through controlling
soil loss or sedimentation from farm land, the general model set is
capable of analyzing many other facets of resource use and
environmental impacts as these relate to trade, agricultural or other
policies. We also have completed models which incorporate legislative
or price restraints to attain energy conservation [1], to promote
environmental quality through reduced use of nitrogen fertilizer and
pesticides [2], to enhance the environment through stream flow
regulations that protect wildlife habitats, and others.


The models have the capacity to evaluate simultaneously variables and
outcomes in (a) national markets, prices, incomes and employment and
(b) production patterns, resource use and economic structure of rather
small resource regions, under the imposition of alternative policies
or futures. We believe that models with these characteristics and
capabilities are extremely important for the future as national, state
and local entities evaluate and consider implementation of
environmental, land use, water and other resource and technological
restraints related to problems emerging under the nation's advanced
state of economic development. Otherwise, the programs and policies
imposed by states, municipalities and regional planning bodies will
encounter unexpected economic effects, causing them to be nullified
because they give inequitable distributions of the costs and benefits
of the goals attained. For example, initial solutions of our models
show that individual states which impose restrained patterns of land
use, water runoff, sedimentation and technologies will find that,
through market impacts, producers and resource owners of other states
and locations will realize economic gains while those of the imposing
state will bear the costs in lower incomes and reduced resource
prices. Even for certain quality controls imposed at the national
level, relative returns can be positive in some regions and negative
in other regions.

Models  Reviewed 

The models reviewed in this paper encompass the whole of U.S.
agriculture and the land and water use relating thereto. These
demand-allocation models incorporate all major agricultural quality
interactions, reflecting restraints in resources for 223 agricultural
producing regions (Figure 1), soil characteristics in 1,891 land
resource groups, water resources for 51 water supply regions (Figure 2)
in the 17 Western States, and demand or commodity balances in 30
consumer market regions (Figure 3). The models, which incorporate a
transportation submodel for commodities and water and product transfer
activities, allow selection of optimal resource use patterns and
environmental quality impacts for the nation in future time periods.
They also reflect comparative advantage in the allocation of land and
water to competing alternatives as represented in relative yields,
general technologies, environmentally restrained technologies,
production costs, transport costs and imposed environmental
restraints. They allow substitution of land at one location for water
at another location a thousand miles away (or vice versa). Similarly,
they allow and analyze these substitutions when environmental
restraints are applied to restrain the technologies used in any one
resource region. Finally, they allow evaluation of various policy
alternatives in use of land and water resources, and environmental
quality controls in interaction with commercial agricultural policies,
export goals and domestic demands in both regional and national
markets.

Figure 1. The 223 producing areas


Objectives

Our overall objectives in building these models are to determine (a)
whether the nation has enough land and water resources to meet
domestic and export food needs when various environmental quality
restraints are imposed, (b) the optimum spatial allocation of these
resources, for the nation and internally for each individual
producing, land and water region, (c) the extent to which sacrifices
must be made in environmental quality goals as other goals (food
prices, exports, treasury costs, farm income, energy use, resource
values, income distribution, etc.) are attained, or vice versa, (d)
the cost to regions and the nation as various land use patterns and
environmental quality goals are attained, (e) the optimal selection
among alternative producing technologies and land use patterns, for
each region and for the nation, as various environmental quality
restraints are imposed, and (f) miscellaneous impacts, including those
relating to farm size and income, the distribution of the costs and
benefits of these patterns or allocations, and employment and income
generation in rural areas.

Figure 3. The 30 consumer market regions

Environmental quality controls based on runoff and sediment transport,
for example, will have direct impact on crops produced and technology
used on erodible lands (as determined by, e.g., slope of land and
amount and intensity of rainfall). However, these environmental
restraints also will have "chain reactions" in optimal land use in
regions 100 or 1,000 miles away which are not subject to runoff and
sediment loss since, nationally, a new configuration of comparative
advantages will be created in relation to both the environmental
restraints and commodity and resource markets. Land with
characteristics giving rise to runoff and sediment loss may be
required to shift to more forages, livestock production or forest
products, although the outcome also will depend on row crop yields,
costs and returns under mechanical erosion control practices. In the
"chain reaction" or regional interdependence relationships, land at
one distant location not subject to erosion which once produced cotton
may now optimally be allocated to soybeans to meet national demands,
while at a different location nonerodible land once allocated to wheat
now may be specified for feed grains as the national livestock ration
and export demands are met. Restraints on chemical fertilizers and
pesticides have similar remote and complex interactions in resource
use among the many different land regions of the nation. Generally,
those regions of ample rainfall and irrigation water will be shifted
towards a less intensive use of land while adapted regions with less
runoff and relatively less dependence on imported technological inputs
will tend towards more intensive land use (e.g., more grain and less
forage and livestock production).

Because of regional interdependencies, it is impossible to plan
nationally efficient uses of land and environmental quality controls
on a region-by-region or regionally independent basis. The models used
incorporate interdependencies among the hundreds of land resource
regions and allow for both direct and indirect impacts among regions
whether they are contiguous or distant from each other. Not only do
they need to incorporate these interdependencies among land resource
regions, water regions and market regions, but also they need to allow
them among resources and commodities. They need to allow substitution
of land at one location for water at a different and distant location,
since a policy restraint which limits the use of capital technology on
land at one location can be offset by a reallocation or extended use
of water at another location, or vice versa. Great flexibility
prevails in the national livestock ration, the major demand
determinant in national land use, and shifts in or restraints on land
use can allow or cause limits on grain production in one location to
be replaced by soybean, forage or a substitute feed grain in another
location. An efficient land use-environmental quality model needs to
allow all of these interdependencies over the nation with reflection
back to optimal land employment for each land resource region.

Nature  of  Models 

To illustrate the general nature of the models involved, under the
restraint of presentation space and time, we use a model projecting to
the year 2000 for a population of 284 million and free market
conditions, with trend levels of agricultural technology in each of
223 agricultural producing regions. This model, only one in a series
we are building, emphasizes optimal land use patterns, agricultural
water allocation, agricultural technology and soil conservation
methods under environmentally restrained soil loss. (Subsequent
formulations of the model set include environmental limits on chemical
fertilizers, pesticides and livestock wastes.) The objective function
in equation (1) minimizes the cost of producing and transporting the
various crop and livestock commodities among producing and land
resource regions of origin, regions of processing, and regions of
consumption. The costs allow the system to consider different
technologies (cropping or land use systems and mechanical practices)
in restraining soil loss to alternative environmental quality levels.
The costs of water consumption and transfer also are included in
equation (1). The programming prices and costs cover all factor costs
(except land rents, which are reflected in shadow prices) and thus
allow simulation of a long-run market equilibrium for each commodity
with a national allocation reflecting the comparative advantage of
each of the 223 producing regions, subject to environmental restraints
and the level and location of consumer demands. The objective function
is to minimize OF, where:

$$OF = \sum_{i}\sum_{k}\Big(\sum_{m} X_{ikm}\,UC_{ikm} + \sum_{n} Y_{ikn}\,UC_{ikn} + \sum_{m} Z_{ikm}\,UC_{ikm}\Big) + \cdots \qquad (1)$$

where the variables, parameters, and other terms
are defined in a following section.

Restraints  and  variables 

Each land group has alternative crop management systems producing
commodities with associated yields and soil loss subject to the soil
types, average weather prevailing and the conservation tillage
practices utilized. Data were developed in conjunction with the Soil
Conservation Service, U.S.D.A., to represent soil loss per acre under
various mechanical practices and rotations or land use systems. The
soil loss alternatives are evaluated through the universal soil loss
equation (2). The equation used for each crop management system is of
the form

$$SL = K \cdot L \cdot S \cdot R \cdot C \cdot P \qquad (2)$$

where

SL is the per acre gross soil loss;

K is the erodibility factor associated with the soil type;

L is the computed value relating slope length to soil loss control;

S is the value derived from a nonlinear function relating slope
gradient to level of soil loss;

R is an index of erodibility for the rainfall of the area accounting
for varying levels of intensity, duration and measured rainfall;

C is an adjustment factor giving an index of the relative ability of
alternative cropping patterns to reduce soil loss;

P is an adjustment factor to account for the potential soil loss
reduction from adopting conservation practices.

Each activity in the model represents an alternative crop management
system which incorporates a given rotation, crop tillage method and a
conservation practice for an individual land resource group. The
rotation and tillage method combine to give a unique C value and the
conservation practice determines the P factor. The K, L, and S factors
are dependent on the soil characteristics and the regional rainfall
patterns determine the R factor.
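
As a minimal sketch of how equation (2) separates the factors fixed by the land (K, L, S, R) from those chosen with the crop management system (C, P), the following computes soil loss for several systems on one land resource group and keeps those within an allowable level. All factor values, system names and the class `CropManagementSystem` are hypothetical illustrations, not data from the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropManagementSystem:
    name: str
    c_factor: float  # set by the rotation and tillage method
    p_factor: float  # set by the conservation practice


def soil_loss(k_factor, l_factor, s_factor, r_factor, c_factor, p_factor):
    """Per acre gross soil loss SL = K * L * S * R * C * P, as in equation (2)."""
    return k_factor * l_factor * s_factor * r_factor * c_factor * p_factor


def admissible_systems(systems, k_factor, l_factor, s_factor, r_factor, allowable):
    """Names of the systems whose soil loss does not exceed the allowable level."""
    return [
        system.name
        for system in systems
        if soil_loss(k_factor, l_factor, s_factor, r_factor,
                     system.c_factor, system.p_factor) <= allowable
    ]


# Hypothetical factors for one land resource group (tons per acre per year).
K, L, S, R = 0.3, 1.2, 1.5, 150.0
SYSTEMS = [
    CropManagementSystem("conventional tillage, straight row", 0.40, 1.0),
    CropManagementSystem("reduced tillage, contoured", 0.15, 0.5),
    CropManagementSystem("reduced tillage, terraced", 0.10, 0.2),
]

print(admissible_systems(SYSTEMS, K, L, S, R, allowable=5.0))
```

Tightening the allowable level removes systems from the admissible set without changing the land factors, which is how a soil loss restraint shifts the choice of technology on a given land class.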

Associated with the alternative crop management systems are specific
per acre crop costs and crop yields for each region. The cost data
reflect expenditures on machinery, labor, pesticides, non-nitrogen
fertilizers (nitrogen is balanced endogenous to the model), and
miscellaneous production items. The component costs reflect different
efficiencies of farming resulting from working land in straight rows,
contours, strip cropping, with terraces, under minimum tillage or
under conventional tillage. The alternative costs also reflect the
higher pesticide requirements and lower machinery and labor
requirements for crops under a reduced tillage cultivation pattern.
The costs sum to an aggregate which depends on the particular cropping
management system and, when combined with the outputs from the system,
reflect the comparative advantage of each system on each land class in
each region.

The outputs from the system reflect yields of each crop and the
associated quantity of soil loss per acre per year. The interaction
within the system also is reflected in a nitrogen balance subsector
where the nitrogen flows in the model are examined. The entire cost
and yield section of the model is interlocked with alternative
technologies, levels of resource input and alternative input uses to
meet domestic and export demands. The nitrogen-fertilizer-crop yield
section is an example of other interrelationships in the model.
Nitrogen available in each region is an independent variable in the
crop yield equation but the source of the nitrogen may vary. It can be
supplied from chemical fertilizers, livestock wastes and from nitrogen
fixation by legumes. The livestock wastes available are dependent on
the type and quantity of feed available for livestock and the
concentration of the animals in the region. Also affecting the yields
of the crop is the land class on which it is grown and the
conservation and tillage practice associated with the cropping
management system.
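
The nitrogen accounting described above can be sketched as a simple regional balance: nitrogen required by crops, less nitrogen supplied by livestock wastes, leaves the amount to be purchased as chemical fertilizer. The sketch below is an illustration under stated assumptions, with hypothetical coefficients and names; it omits legume fixation and the pasture and hay terms that the full balance carries.

```python
def purchased_nitrogen(crop_acres, n_required_per_acre,
                       livestock_units, n_supplied_per_unit):
    """Residual nitrogen (pounds) to be purchased in one producing region.

    A negative result means livestock wastes exceed crop requirements,
    so the plan could not satisfy an equality balance with purchases >= 0.
    """
    required = sum(acres * n_required_per_acre[crop]
                   for crop, acres in crop_acres.items())
    supplied = sum(units * n_supplied_per_unit[kind]
                   for kind, units in livestock_units.items())
    return required - supplied


# Hypothetical region: acreages, pounds of nitrogen required per acre,
# livestock units, and pounds of nitrogen equivalent produced per unit.
crop_acres = {"corn": 1000, "wheat": 500}
n_required_per_acre = {"corn": 120, "wheat": 60}
livestock_units = {"beef": 200, "pork": 300}
n_supplied_per_unit = {"beef": 100, "pork": 50}

print(purchased_nitrogen(crop_acres, n_required_per_acre,
                         livestock_units, n_supplied_per_unit))
```

Because livestock numbers depend on the feed grown, and feed yields depend on nitrogen, the cropping and livestock sides of a region cannot be costed independently, which is the interlocking the text describes.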

Both dryland and irrigated crop variables are included for producing
regions in the 17 Western States which grow irrigated crops, and the
model allows selection among dryland or irrigated farming for each
region. A range of livestock rations (variables) is allowed in all 223
producing regions since the least-cost feed mix can be drawn from
various grain, forage and pasture crops grown in the region or
imported (where allowed) from others. The model includes variables
representing various cropping systems and technologies affecting soil
loss, livestock production, commodity transportation, water transfers,
consumer demand fulfillment and alternative export levels.

Each of the 223 producing regions has land restraints of the nature
indicated in equations (3)-(9). Each region has a soil loss restraint
as in equation (10), a nitrogen balance equation as in (11) and a
pasture restraint as in (12). Each of the 51 water supply regions has
a water restraint as in equation (13), where variables and parameters
are defined subsequently.

Each of the 30 consuming regions has net demand equations for all of
the relevant crop and livestock activities, as illustrated by equation
(14). Regional consumer demand quantities were determined exogenously
from geographic and national projections of population, economic
activity, per capita incomes and international exports through the
region for 2000. National demands were defined for cotton and sugar
beets as indicated in equation (15). Poultry products, sheep, and
other livestock were regulated at the consuming region level.
International trade was regulated at the regional levels as indicated
in equations (16) and (17).

Commodities included in the endogenous analysis are soil loss,
nitrogen, water, corn, sorghum, wheat, barley, oats, soybeans, cotton,
sugar beets, tame hay, wild hay, improved pasture, unimproved and
woodland pasture, cropland pasture, public grazing lands, forest lands
grazed, all dairy products, pork, and beef. Also accounted for, prior
to solution of the model, are other crops, including fruits, nuts,
vegetables, rice and flax, and broilers, turkeys, egg production,
sheep and other livestock.

Dryland cropland restraint, each region by land class:

$$\sum_{m} X_{ikm}\,a_{ikm} \le LD_{ik} \qquad (3)$$

Irrigated cropland restraint, each region by land class:

$$\sum_{n} Y_{ikn}\,a_{ikn} + \sum_{m} Z_{ikm}\,a_{ikm} \le LR_{ik} \qquad (4)$$

Dryland wild hay restraint, each region:

$$DWH_{i}\,a_{i} \le ADWH_{i} \qquad (5)$$

Irrigated wild hay restraint, each region:

$$IWH_{i}\,a_{i} \le AIWH_{i} \qquad (6)$$

Dryland permanent pasture restraint, each region:

$$DPP_{i}\,a_{i} \le ADPP_{i} \qquad (7)$$

Irrigated permanent pasture restraint, each region:

$$IPP_{i}\,a_{i} \le AIPP_{i} \qquad (8)$$

Forest land grazed restraint, each region:

$$FLG_{i}\,a_{i} \le AFLG_{i} \qquad (9)$$

Soil loss restraint, each region, each land class, each activity:

$$SL_{ik,m+n} \le ASL_{ik,m+n} \qquad (10)$$

Nitrogen balance restraint, by region:

$$\begin{aligned}
FP_{i} &+ \sum_{p} b_{ip}\,L_{ip} + EL_{i}\,b_{i} - EC_{i}\,f_{i}
 - \sum_{k}\Big(\sum_{m} X_{ikm}\,f_{ikm} + \sum_{n} Y_{ikn}\,f_{ikn} + \sum_{m} Z_{ikm}\,f_{ikm}\Big) \\
 &- DPP_{i}\,f_{i} - IPP_{i}\,f_{i} - DWH_{i}\,f_{i} - IWH_{i}\,f_{i} - FLG_{i}\,f_{i} = 0
\end{aligned} \qquad (11)$$

Pasture use restraint, each region:

$$\begin{aligned}
&\sum_{k}\Big(\sum_{m} X_{ikm}\,r_{ikm} + \sum_{n} Y_{ikn}\,r_{ikn} + \sum_{m} Z_{ikm}\,r_{ikm}\Big)
 + DPP_{i}\,r_{i} + IPP_{i}\,r_{i} + FLG_{i}\,r_{i} \\
&\quad - \sum_{p} L_{ip}\,q_{ip} - EL_{i}\,q_{i} \ge 0
\end{aligned} \qquad (12)$$

Water use restraint, by water region:

$$\begin{aligned}
&WB_{w} + WT_{w} + WI_{w} - WO_{w} - WX_{w} - WE_{w} + WD_{w}
 - \sum_{i \in w} IWH_{i}\,d_{i} - \sum_{i \in w} IPP_{i}\,d_{i} \\
&\quad - \sum_{i \in w}\sum_{k}\Big(\sum_{m} X_{ikm}\,d_{ikm} + \sum_{n} Y_{ikn}\,d_{ikn} + \sum_{m} Z_{ikm}\,d_{ikm}\Big)
 - \sum_{i \in w}\sum_{p} L_{ip}\,d_{ip} - \sum_{i \in w} PN_{i}\,d_{i} \ge 0
\end{aligned} \qquad (13)$$

Commodity balance restraint, each consuming region:

$$\begin{aligned}
&\sum_{k}\sum_{i \in j}\Big(\sum_{m} X_{ikm}\,cy_{ikmc} + \sum_{n} Y_{ikn}\,cy_{iknc} + \sum_{m} Z_{ikm}\,cy_{ikmc}\Big)
 + \sum_{i \in j}\sum_{p} L_{ip}\,cy_{ipc} \\
&\quad \pm \sum_{t \in j} T_{tc} - E_{jc} - \sum_{i \in j} PN_{i}\,cy_{ic} - EL_{j}\,cy_{jc} \ge 0
\end{aligned} \qquad (14)$$

National commodity balance restraints, for cotton, sugar beets and
spring wheat:

$$\sum_{i}\sum_{k}\Big(\sum_{m} X_{ikm}\,cy_{ikmg} + \sum_{n} Y_{ikn}\,cy_{ikng} + \sum_{m} Z_{ikm}\,cy_{ikmg}\Big)
 - \sum_{i} PN_{i}\,cy_{ig} - EX_{g} \ge 0 \qquad (15)$$

National export restraints:

$$\sum_{j} E_{jc} \ge EX_{c} \qquad (16)$$

National import restraints:

$$\sum_{i} E_{i,c+e} \le IM_{c+e} \qquad (17)$$

Non-negativity restraints:

$$\begin{aligned}
&X_{ikm},\ Y_{ikn},\ Z_{ikm},\ L_{ip},\ DWH_{i},\ IWH_{i},\ DPP_{i},\ IPP_{i},\ FLG_{i},\ FP_{i},\ EL_{i}, \\
&WB_{w},\ WT_{w},\ WI_{w},\ WD_{w},\ WX_{w},\ WE_{w},\ PN_{i},\ T_{tc},\ E_{jc},\ E_{i,c+e} \ge 0
\end{aligned} \qquad (18)$$

The subscripts and variables for the above equations
are defined in the section below.
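
To make the structure of a single restraint concrete, the following sketch evaluates the dryland cropland restraint (3) and the production cost contribution for one land class in one region. It only checks a given activity plan; it does not solve the linear program. The activity names, coefficients and unit costs are hypothetical.

```python
def dryland_use(levels, land_coeff):
    """Left-hand side of restraint (3): sum over systems m of X * a."""
    return sum(levels[m] * land_coeff[m] for m in levels)


def satisfies_dryland_restraint(levels, land_coeff, available_acres):
    """True if the plan uses no more dryland cropland than LD for the class."""
    return dryland_use(levels, land_coeff) <= available_acres


def production_cost(levels, unit_cost):
    """Cost contribution of the plan: sum over systems m of X * unit cost."""
    return sum(levels[m] * unit_cost[m] for m in levels)


# Hypothetical plan for one land class in one producing region.
levels = {"corn-soybean rotation": 400.0, "continuous corn": 300.0}     # X
land_coeff = {"corn-soybean rotation": 1.0, "continuous corn": 1.0}     # a
unit_cost = {"corn-soybean rotation": 90.0, "continuous corn": 110.0}   # per acre

print(satisfies_dryland_restraint(levels, land_coeff, available_acres=800.0))
print(production_cost(levels, unit_cost))
```

In the full model, rows of this kind are repeated for every land class in every producing region and linked through the commodity, water and nitrogen balances, so the optimal activity levels are found jointly rather than region by region.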
Subscripts and variables of the model

The subscripts and variables relating to the equations in the text are
those which follow:

subscripts

c = 1,2,...,15 for the endogenous commodities in the model,

e = 1,2,...,5 for the exogenous livestock alternatives considered,

g = 1,2,3 for the commodities balanced at the national level,

i = 1,2,...,223 for the producing areas of the model,

j = 1,2,...,30 for the consuming regions of the model,

k = 1,2,...,9 for the land classes in each producing area,

m = 1,2,..., for the dryland crop management systems on a land class
in a producing area,

n = 1,2,..., for the irrigated crop management systems on a land class
in a producing area,

p = 1,2,..., for the livestock activities defined in the producing
area,

t = 1,2,...,458 for the transportation routes in the model,

w = 1,2,...,51 for the water supply regions in the model.

variables 

a, the amount of land used by the associated activity from the land
base as indicated by the subscripts;

AIPP, the number of acres of irrigated permanent pasture available in
the subscripted producing area;

ADWH, the number of acres of dryland wild (non-cropland) hay available
in the subscripted producing area;

AFLG, the number of acres of forest land available for grazing in the
subscripted producing area;

ADPP, the number of acres of dryland permanent pasture available for
use in the subscripted producing area;

AIWH, the number of acres of irrigated wild (non-cropland) hay
available in the subscripted producing area;

ASL, the per acre allowable soil loss subscripted for land class,
producing area and activity;

b, the units of nitrogen equivalent fertilizer produced by livestock,
subscripted for producing area and activity;

cy, interaction coefficients (yield or use) of the relevant commodity
as regulated by the associated activity and specified by the
subscripts;

d, the per unit of activity water use coefficient as regulated by the
associated activity and specified by the subscripts;

DPP, level of use of dryland permanent pasture in the subscripted
producing area;

DWH, level of use of dryland wild (non-cropland) hay in the
subscripted producing area;

E, level of net export for the associated commodity in the associated
region as specified by the subscripts;

EC, level of exogenous crop production by subscripted region;

EL, level of exogenous livestock production consistent with the
subscripted region;

EX, the level of national net export for the subscripted commodity as
determined exogenous to the model;

f, the units of nitrogen equivalent fertilizer required by the
associated activity and specified by the subscripts;

FLG, level of forest land grazed in the subscripted producing area;

FP, number of pounds of nitrogen equivalent fertilizer purchased in
the subscripted producing area;

IM, level of national net imports for the subscripted commodities as
determined exogenous to the model;

IPP, level of use of the irrigated permanent pasture in the
subscripted region;

IWH, level of use of the irrigated wild (non-cropland) hay in the
subscripted region;

L, level of the livestock activity with the type and region dependent
on the subscripts;

LD, number of acres of dryland cropland available for use as specified
by the region and land class subscripts;

LR, number of acres of irrigated cropland available for use as
specified by the region and land class subscripts;

PN, level of population projected to be in the subscripted region;

q, units of pasture, in hay equivalents, consumed by the associated
livestock activity and specified by the subscripts;

r, units of aftermath or regular pasture, in hay equivalents, produced
by the associated cropping or pasture activity and identified by the
subscripts;

SL, level of soil loss associated with any activity over the range m+n
in the region and land class designated by the subscripts;

T, level of transportation of a unit of the commodity either into or
out of the consuming region designated by the subscripts;

WB, level of water purchase for use in the water balance of the water
supply region designated by the subscript;

WD, level of desalting of ocean water in the water supply region
designated by the subscript;

WE, level of water to be exported from the water supply region
subscripted;

WI, level of movement of water in or out of the water supply region
through the interbasin transfer network;

WO, level of water requirement for onsite uses such as mining,
navigation and estuary maintenance in the water supply region
subscripted;

WX, level of water use for the exogenous agricultural crops and
livestock in the water supply region subscripted;

X, level of employment of the dryland crop management system,
rotation, in the region and on the land class as designated by the
subscripts;

Y, level of employment of the irrigated crop management system,
rotation, in the region and on the land class as designated by the
subscripts;

Z, level of employment of the dryland crop management system,
rotation, on the land class in the region as designated by the
subscripts when the land has been designated as available for
irrigated cropping patterns.

Illustration  of  Results 

Solution  of  the  model  provides  indication  of  opti- 
mal land  use  in  each  producing  region  or  each  of 
the  1,891  land  resource  groups  at  prescribed 
levels  of  environmental  quality  restraints,  con- 
sumer demand  and  distribution,  export  levels  and 
other  policy,  or  market  and  technology  parameters. 
Our  illustration  is  in  the  case  where  the  only 
environmental  restraint  is  soil  loss.     It  also 
designates  the  level  of  production  in  each  re- 
gion and  the  optimal  flows  of  commodities  to 
consuming  regions  and  export  markets.     For  pur- 
poses of  illustration,  we  refer  to  solutions  where 
(a)  soil  loss  is  not  restricted  and  (b)   soil  loss 
is  restricted  to  5  tons  per  acre  per  year  for 
each  of  the  1,891  soil  resource  groups  and  exports 
are  at  a  modest  level.     While  land  use  could  be 
mapped  or  indicated  by  each  of  the  1,891  land 
resource  groups,  we  illustrate  on  the  basis  of 
the 223 producing regions only. The model indicates not only the
land devoted to each crop use in each region and group, but can
also indicate the technologies for each, such as dryland or
irrigated production, alternative rotations, conventional or
reduced tillage methods, and others which affect land and water
use and sedimentation.

Figure  4  indicates  an  optimal  distribution  of  row 
crop  acreage  among  the  223  producing  regions. 
Figure  5  indicates  an  optimal  distribution  of  the 
close  grown  crops  and  Figure  6  gives  the  hayland 
distribution  when  no  restraints  are  placed  on 
soil  loss  or  chemical  nitrogen  use. 


Figure 4.  Location of dryland and irrigated row crops with no
soil loss restriction in 2000.


Soil loss

While land use, tillage methods and soil loss are generated in
the models by producing regions and land resource groups, we
summarize results only for seven major geographic regions of the
U.S. because of time and space restraints. While solutions of the
model were made for several soil loss, export and nitrogen
restriction levels, we similarly summarize solutions only for two
soil loss levels, one export level and unrestrained nitrogen use
(except for nitrogen balance within a producing region).

Restricting soil loss per acre to five tons would distribute land
use and technologies interregionally to reduce national soil loss
to 727 million tons. Without the restriction, interregional land
use allocations and technologies to meet export demands would
generate a national soil loss about 3.7 times as large, or 2,677
million tons. As Table 1 indicates, the reduction in average per
acre soil loss, as a source of sedimentation, would be extremely
large on land classes V-VIII, which are most erosive. While we do
not do so here, our models allow indication of soil loss changes
by each individual region.

Regional variation in reduced soil loss per acre is great. The
largest reductions take place in the South Atlantic (18.2 tons
per acre) and South Central (11.5 tons per acre) regions, where
land and current land use methods give rise to serious erosion
(Table 2). The reduction in soil loss when a 5 ton per acre limit
is imposed is attained especially by a switch from conventional
tillage-straight row farming to contour, strip-cropping and
terraces (Table 3). There also is a significant shift to reduced
tillage farming practices to meet the soil loss limit of five
tons per acre. Acres receiving reduced tillage practices increase
from around 21 million in the unrestricted solution to near 58
million acres under the five-ton solution. Conventional tillage
practices decline from 248 million acres under the unrestricted
soil loss to 201 million acres under the solution for a five-ton
soil loss. Within the conventional tillage group,

Figure 5.  Location of dryland and irrigated close grown crops
with no soil loss restriction in 2000.


Figure 6.  Location of dryland and irrigated hay with no soil
loss restriction in 2000.


Table 1.  National soil loss total and average per acre by land
resource groups for two levels of soil loss restriction, 2000.

Item                           Unrestricted     5 ton
                               soil loss        soil loss

Total tons (million tons)         2,677            727

Average tons per acre
  class I & II land                 6.2            2.7
  IIIE & IVE land                  17.8            3.1
  other III & IV land              15.6            2.8
  V - VIII land                    28.5            1.5
  national average                  9.9            2.8

Table 2.  Average per acre soil loss by major region for two
levels of soil loss restriction models, 2000.

Region                Unrestricted     5 ton
                      soil loss        soil loss

National                  9.9            2.8
North Atlantic            9.0            3.5
South Atlantic           21.5            3.3
North Central             2.8   (other entry illegible)
South Central            15.1            3.6
Great Plains              3.2            1.5
North West                2.3            1.7
South West                3.3            2.6

Table 3.  Thousand acres of cultivated land by conservation-
tillage practices for two levels of soil loss restriction, 2000.

Conservation tillage            Unrestricted     5 ton
                                soil loss        soil loss

Conventional tillage              247,894         201,238
  straight row                    233,475         129,120
  contoured                        11,254          37,116
  strip cropped & terraced          3,165          35,002

Reduced tillage                    21,219          57,644
  straight row                     21,219          24,822
  contoured                             0          18,902
  strip cropped & terraced              0          13,920

straight-row farming is nearly halved. Contouring is tripled and
strip cropping-terracing practices are increased 1,000 percent to
meet soil loss restrictions (Table 3). While reduced tillage as a
whole nearly triples, with very large increases in contouring,
terracing and strip cropping, straight-row methods of reduced
tillage do not increase appreciably.
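
The proportions quoted above can be checked directly against Table 3. The short sketch below is only an arithmetic check on the tabulated acreages; the function and variable names are illustrative and are not part of the model described in the paper.

```python
# Acreage in thousand acres, transcribed from Table 3 as
# (unrestricted soil loss, 5 ton soil loss).
TABLE_3 = {
    "conventional": {
        "straight row": (233_475, 129_120),
        "contoured": (11_254, 37_116),
        "strip cropped & terraced": (3_165, 35_002),
    },
    "reduced": {
        "straight row": (21_219, 24_822),
        "contoured": (0, 18_902),
        "strip cropped & terraced": (0, 13_920),
    },
}


def group_totals(group):
    """Return (unrestricted, 5 ton) acreage summed over one tillage group."""
    unrestricted = sum(pair[0] for pair in group.values())
    five_ton = sum(pair[1] for pair in group.values())
    return unrestricted, five_ton


def change_ratio(pair):
    """Return the 5 ton acreage as a multiple of the unrestricted acreage."""
    unrestricted, five_ton = pair
    return five_ton / unrestricted


if __name__ == "__main__":
    conventional = TABLE_3["conventional"]
    print("conventional totals:", group_totals(conventional))
    print("reduced totals:", group_totals(TABLE_3["reduced"]))
    for practice, pair in conventional.items():
        print(practice, round(change_ratio(pair), 2))
    print("reduced tillage overall:",
          round(change_ratio(group_totals(TABLE_3["reduced"])), 2))
```

The group subtotals reproduce the "Conventional tillage" and "Reduced tillage" rows of the table, and the ratios correspond to the "nearly halved", "tripled" and "increased 1,000 percent" statements in the text.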

The shift in acreages (Table 4) is partly hidden in the reduction
of 16.5 million acres used for grain crops and a corresponding
increase of only 5.5 million acres in hay on cultivated lands.
Part of the production required to meet national demands comes
from an increase in noncropland roughage production (permanent
hay and pasture). More of the reduction in the acreage required
to meet the demand for agricultural products results from the
shift in production to the higher cost and higher yielding
erosion control practices. Also, a shift in acreage between land
classes puts the grain crops on the higher yielding and less
erosive lands.

Costs  of  production,   in  conjunction  with  the 
transportation  network  and  the  soil  loss  restric- 
tions imposed,  determine  the  national  equilibrium 


prices  for  the  commodities.     Table  5  indicates  the 
relative  equilibrium  prices  of  the  commodities 
generated  by  the  model  under  the  two  levels  of 
soil  loss  and  a  single  export  level.     Soil  loss 
restrictions  have  the  largest  effect  on  prices  for 
commodities  which  concentrate  on  land  with  high 
soil  loss  potential.     Compared  to  absence  of  soil 
loss  restrictions,  cotton  and  soybean  prices  in- 
crease over  20  percent  while  wheat  and  hay  crops 
increase  by  less  than  10  percent.     The  increase 
in  grain  prices  results  in  corresponding  increases
in  cattle  prices.     In  evaluating  the  effect  of 
any  environmental  policy  alternative,  the  effect 
on  the  desired  parameter  and  the  change  in  farm 
price  of  agricultural  products  are  two  easily 
observed  changes  in  our  models. 

Changes  summarized  at  the  national  level  do  not, 
of  course,  reflect  the  effects  in  particular  re- 
gions and  on  individual  enterprises.     These,  how- 
ever, are  all  available  from  our  models.  The 
shift  in  production  from  one  region  to  another  re- 
sults in  income  repercussions  on  the  rural  commu- 
nity affected.     The  effect  of  such  a  shift  depends 
on  the  degree  of  multiple  level  resource  use. 

The  data  in  Tables  1-5  indicate  that  American 

Table 4.  National production of row crops, close grown crops and
rotation roughage crops for two levels of soil loss restriction,
2000.*

Land use                              Unrestricted     5 ton
                                      soil loss        soil loss

Acres cultivated (000)                  269,113         258,882
  Row crops (000)                       148,226         136,035
  Close grown crops (000)                75,535          73,478
  Rotation roughage crops (000)          45,352          49,369
Non-rotation roughage crops (000)       303,060         310,697

* Demand levels are based on projected per capita food
consumption levels, 284 million people in 2000, and international
trade of grains equal to the 1969-1971 annual averages.


Table 5.  Relative farm level prices for some agricultural
commodities with two levels of soil loss restriction, 2000.

Commodity        Unrestricted     5 ton
                 soil loss        soil loss

Corn                 100            107
Wheat                100            103
Soybeans             100            115
Cotton               100            112
Hay                  100            101
Cattle               100            104
Hogs                 100            105
Milk                 100            100

agriculture  has  great  capacity  and  flexibility  in 
adapting  to  certain  environmental  quality  goals. 
By  shifting  land  use  among  the  many  producing  re- 
gions and  land  resource  groups  in  terms  of  their 
comparative  advantage  in  yields,   commodity  costs, 
location,  and  transportation,  national  and  region- 
al demands  can  be  met  without  large  increases  in 
food  prices  and  costs  for  consumers  at  the  export 
level  examined.     The  level  of  exports  per  se  may 
have  greater  impact  on  consumer  food  costs  than 
does  a  relatively  wide  adaptation  of  agriculture 
and  land  use  to  environmental  quality  goals.  We 
will,  however,  provide  quantitative  analysis  of 
these possibilities, along with other environmental quality
practices, in upcoming presentations.

I  INTRODUCTION

Human  diet  problems  fall  into  two  major  cate- 
gories known  as  food  planning  and  meal  planning 
problems.     Both  problem  areas  have  been  shown  to 
be  amenable  to  mathematical  definition,  formulation 
and  solution. 

Food  planning  is  concerned  with  decisions  as 
to  which  food  entity,  and  how  much,  to  purchase 
subject  to  given  budgetary,  nutritional,  and  accep- 
tability requirements.     The  first  statement  of  this 
problem  in  the  context  of  the  cost  of  subsistence 
is due to Stigler [17]. This was later reformulated by Dantzig
[10] as the classical example of a linear programming model, and
was refined later with respect to consumer acceptance by Smith [15].
The  concept  is  still  in  use  in  terms  of  food  groups 
in  connection  with  USDA  family  food  plans  [15]. 

Meal  planning  is  a  decision  problem  of  find- 
ing an  optimum  sequence  of  meals  consisting  of  com- 
binations of  prepared  foods,  called  menu  items,  to 
be  eaten  by  a  person  or  a  population,  such  that  the 
required  structure  of  the  meals  and  given  budgetary, 
nutritional  and  food  production  specifications  are 
met.     The  entities  of  meal  planning  are  menu  items 
of  known  portion  size,  with  the  food  ingredients 
per  portion  defined  by  the  respective  recipes. 
This problem was first identified and solved as a mathematical
programming problem by Balintfy [2]. The first approach and some
of the later refinements [3] considered one meal (or day) at a
time, using what is now called a multistage menu scheduling
algorithm. Each meal was a least-cost (best buy) combination of
items, selected so as to avoid incompatibility of items between
meals and within meals. The former rule was enforced by requiring
a minimum separation, in meals, between consecutive appearances
of the same item or the same kind of item on the schedule. The
latter rule was enforced by preventing items from the same
attribute class from occurring together in a solution.
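
The two compatibility rules can be illustrated with a small sketch. It is a greedy stand-in and not the multistage algorithm of [2] or [3]: each meal simply takes the cheapest admissible item per course. The item names, costs, attribute classes and the function `schedule_menu` are hypothetical and serve only to show how a minimum separation and an attribute-class exclusion act on a schedule.

```python
def schedule_menu(items, courses, n_meals, min_separation):
    """Greedy meal-by-meal schedule (illustrative only).

    items maps an item name to (course, attribute_class, cost).
    Between meals: an item may reappear only after min_separation meals.
    Within a meal: no two chosen items may share an attribute class.
    """
    last_served = {}  # item name -> index of the meal where it last appeared
    schedule = []
    for meal in range(n_meals):
        chosen = []
        used_classes = set()
        for course in courses:
            candidates = [
                (cost, name, attr)
                for name, (item_course, attr, cost) in items.items()
                if item_course == course
                and attr not in used_classes
                and meal - last_served.get(name, -min_separation) >= min_separation
            ]
            if not candidates:
                raise ValueError(f"no admissible {course} for meal {meal}")
            cost, name, attr = min(candidates)  # least-cost admissible item
            chosen.append(name)
            used_classes.add(attr)
            last_served[name] = meal
        schedule.append(chosen)
    return schedule


# Hypothetical data: (course, attribute class, cost per portion).
ITEMS = {
    "beef stew": ("entree", "beef", 0.90),
    "roast chicken": ("entree", "poultry", 0.70),
    "baked fish": ("entree", "fish", 0.80),
    "rice": ("starch", "grain", 0.10),
    "mashed potato": ("starch", "potato", 0.12),
    "noodles": ("starch", "pasta", 0.11),
}

if __name__ == "__main__":
    for meal in schedule_menu(ITEMS, ["entree", "starch"], 4, 3):
        print(meal)
```

With a separation of three meals the cheapest entree cannot be repeated immediately, so the schedule rotates through the more expensive items before returning to it. This is the sense in which the separation rule safeguards variety at some cost.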

The  concept  of  minimum  separation  of  items 
was  not  only  useful  in  menu  scheduling  as  a  safe- 
guard for  variety,  i.e.,  acceptability,  but  it 
could  also  be  used  to  establish  upper  bounds  on 
the  frequency  of  items  in  a  given  time  period. 
This realization led to the development of a bounded linear
programming model for meal planning [5], which defined in a
single stage the optimum (least cost) frequencies of menu items
for a period, called a menu plan, which later could be scheduled
into a sequence of meals.

Initially, both versions of these modeling approaches to meal
planning had cost-minimizing objective functions and assured
acceptability operationally only by variety and entry
restrictions. The cost minimization objective was selected
primarily to show the economic impact of mathematical
optimization as opposed to conventional methods. Another
rationale for cost minimizing was the paucity of data and
methodology to represent food preferences quantitatively as meal
planning objectives. Indeed, cost savings of 10 to 30 percent of
food cost have been achieved in a variety of applications [9,11].

These  applications  were  initiated  in  hospi- 
tals, although  a  variety  of  other  institutional 
feeding  programs,  such  as  school  lunch  service, 
college  food  service,  as  well  as  nursing  home, 
detention  home,  and  military  food  service  opera- 
tions also  could  utilize  a  better  approach  than 
the  prevailing  conventional  method  of  menu  plan- 
ning.    Such  methods  cannot  take  into  consideration 
explicitly  and  quantitatively  the  population  pref- 
erences, the  nutrient  composition  and  the  cost  of 
menu  items,  and  hence  the  resulting  conventional 
menus  are  ipso  facto  suboptimal  and  often  infeas- 
ible  relative  to  the  stated  objectives  and  con- 
straints  [3]  . 

The role of food preferences has long been recognized, and food
preference and preferred serving-frequency data have been
routinely collected in the food service industry. Such
information was, however, utilized only subjectively in menu
planning, mostly because the functional relation between the
preference for an item and its serving frequency was not
recognized. Benson [8] was the first to represent this relation
by fitting data to a polynomial function, but no attempt was made
to use his results either in menu planning or in mathematical
programming formulations.

Major developments have taken place in recent years in the
mathematical modeling of food preferences, following the
discovery of time-related preference functions. The impact of
this  new  development  on  the  modeling  of  meal  plan- 
ning decisions  as  mathematical  optimization  prob- 
lems is  investigated  in  the  sections  that  follow. 

II  MATHEMATICAL FORMULATION OF FOOD PREFERENCE FUNCTIONS

It  is  assumed  -  as  is  observable  in  reality  - 
that  most  food  is  consumed  in  discrete  portions  at 
discrete  points  in  time.     Consequently,  one  can 
relate  a  measure  of  satisfaction  with  the  event 
that  a  fixed  quantity  of  food  is  consumed  at  a 
given  time.     One  can  also  assume  that  for  foods 
that  are  familiar  and  are  more  or  less  routinely 
consumed  by  an  individual,  satisfaction  with  foods 
is  not  only  experienced  but  also  anticipated  to  such 
a  degree  that  a  measure  of  utility  can  be  elicited 
by  collecting  preference  ratings  for  a  set  of  foods 
on  some  centered  scale.     In  this  investigation, 
therefore,  it  will  be  assumed  that  preference  rat- 
ings are  estimates  of  the  measure  of  satisfaction 
of  an  individual  with  a  food.     The  word  food  is 
used  here  as  a  collective  term  applicable  to  the 
special  cases  of  both  food  groups  and  menu  items 
as  the  case  may  be. 

Let $h(t)$ be the preference rating of an individual for a given
food item at time $t$, where $t$ is measured from the last time
the item was consumed. The following properties can be postulated
for $h(t)$:

(a) $h(0) = -\infty$, since a zero time interval between eating
fixed portions of a food is impossible as well as intolerable.

(b) $h(\infty) = a$, where $a$ is a positive constant expressing
the preference for a known food item when it has not been
available (hence not consumed) for a long time. It is possible
that $a$ is itself some function of absolute time, reflecting
shifts of taste or seasonality, but these effects are not
considered here.

(c) $h[\alpha t_1 + (1-\alpha) t_2] \ge \alpha h(t_1) + (1-\alpha) h(t_2)$
for $0 \le \alpha \le 1$ and $0 < t_1 < t_2 < \infty$, which
means that the preference-time function is assumed to be concave.
The evidence for this assumption is indirect, but convincing. The
concavity of $h(t)$ is consistent with the observation that most
people tend to separate their preferred food items on the time
scale with fairly equal time intervals, as opposed to clustering
them in a succession of meals.

(d) $\frac{d}{dt}[h(t)/t] = 0$ at some unique value $t = T_0$
$(0 < T_0 < \infty)$, where $h(t)/t = g(t)$ is the preference
function averaged over time, and it is postulated that this
time-average has a unique maximum at time $T_0$. This property of
$g(t)$ is supported by the evidence that people can estimate
values of $T_0$, as shown by their ability to respond to
questions such as "how frequently do you like to eat this given
item?"

Empirical verification of the above assumptions is available in
a report: Modeling Food Preferences Over Time [6]. It was found
that preference for a particular menu item can be best described
by the recursive formula

(1)    $h(t_n) = f(t_n - t_{n-1}) - e^{-r(t_n - t_{n-1})} [f(\infty) - h(t_{n-1})]$

where $r > 0$ and $t_n$ indicates the point on the absolute time
scale at which the item has been consumed $n$ times before.
Figure 1 shows the analytical form of this recursive relation in
terms of parameters estimated from observed data.


Here

(2)    $f(t_n - t_{n-1}) = f(t') = a - b e^{-ct'}$

where $a > 0$, $b > 0$, and $c > 0$ are parameters of a first
order differential equation which postulates that the rate of
increase of preference in time is proportional to the effect of
"monotony" expressed by the difference $[a - f(t')]$.

Substituting (2) into (1) and letting

       $\lim_{n \to \infty} h(t_n) = h(t)$,

(1) reduces to

(3)    $h(t) = a - \dfrac{b e^{-ct}}{1 - e^{-rt}}$

This is the expression of the preference-time relation under the
assumption that the item is repeatedly consumed at identical time
intervals of length $t$. From (3) the time-averaged preference
function is obtained as

(4)    $g(t) = \dfrac{a}{t} - \dfrac{b e^{-ct}}{t(1 - e^{-rt})}$

Figure 2 shows the shapes of the $f(t)$, $h(t)$ and $g(t)$
functions for a particular item as rated by one subject. It is
seen that $f(t) > h(t)$ for all values of $t$, and $g(t)$ has a
unique maximum at $T_0 = 5$ days.
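
The relations (1) through (4) can be explored numerically. The sketch below is a minimal illustration under assumed parameter values (the values of a, b, c and r are hypothetical and are not the estimates behind Figure 2). It iterates the recursion (1) at a fixed interval, compares the result with the closed form (3), and locates the maximum of g(t) by a simple grid search.

```python
import math


def f(t, a, b, c):
    """Equation (2): preference after an interval t with no earlier history."""
    return a - b * math.exp(-c * t)


def h_steady(t, a, b, c, r):
    """Equation (3): preference when the item is eaten every t days."""
    return a - b * math.exp(-c * t) / (1.0 - math.exp(-r * t))


def g(t, a, b, c, r):
    """Equation (4): time-averaged preference h(t)/t."""
    return h_steady(t, a, b, c, r) / t


def h_recursive(t, a, b, c, r, n_steps=200, h_start=0.0):
    """Iterate equation (1) with equal intervals t; f(infinity) equals a."""
    h = h_start
    for _ in range(n_steps):
        h = f(t, a, b, c) - math.exp(-r * t) * (a - h)
    return h


def argmax_g(a, b, c, r, lo=0.1, hi=60.0, steps=6000):
    """Grid search for the interval T0 that maximizes g(t)."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda t: g(t, a, b, c, r))


# Hypothetical parameter values, chosen only for illustration.
A, B, C, R = 8.0, 10.0, 0.3, 0.5

if __name__ == "__main__":
    print("recursion:", h_recursive(3.0, A, B, C, R))
    print("closed form:", h_steady(3.0, A, B, C, R))
    print("T0 (days):", argmax_g(A, B, C, R))
```

Because each step of (1) multiplies the previous rating by a factor smaller than one, the recursion settles on the closed form (3) regardless of the starting value, which is why (3) can stand in for (1) when intervals are equal.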

The authors' earlier report [6] has shown the methods and the
results of estimating the parameters of the $h(t)$ function from
questionnaires. It is noted, however, that one of the four
parameters, $r$, is needed basically only for the computation of
the recursive relations in (1). For cases where the time
intervals can be regarded as equidistant, a direct analytical
expression for $h(t)$ can be attempted with more economy in
parameters. For this reason an approximation of $h(t)$ by $H(t)$
is introduced in the form

(5)    $H(t) = a - \dfrac{1}{u t^{v}}$

It is obvious that $H(t)$ satisfies conditions (a) through (d)
stipulated for preference-time functions. Moreover, it is
possible to estimate the parameters of $H(t)$ from those of the
$h(t)$ function. Conditions (a) and (b) are clearly satisfied at
the same value of $a$ for both functions. By imposing two
additional conditions, the two other parameters of the $H(t)$
function can be uniquely defined. These two conditions are as
follows:

1. Requiring that both functions have an identical zero crossing,
i.e.,

       $h(t) = H(t) = 0$ at $t = t_0$.

This is satisfied if

(6)    $a = \dfrac{1}{u t_0^{v}}$

The value of $t_0$ can be determined by solving the implicit
nonlinear equation

       $\phi(t_0) = a(1 - e^{-r t_0}) - b e^{-c t_0} = 0$

by some numerical method.

2. Requiring that the time-averaged preferences have their
maximum at the same time $t = T_0$ for both functions. As in
expression (4), $G(t)$ is defined as

(7)    $G(t) = \dfrac{H(t)}{t} = \dfrac{a}{t} - \dfrac{1}{u t^{v+1}}$

Then the condition that

       $\dfrac{d}{dt} G(t) = \dfrac{d}{dt} g(t) = 0$ at $t = T_0$

implies that

       $-\dfrac{a}{T_0^{2}} + \dfrac{v+1}{u T_0^{v+2}} = 0$

i.e.,

(8)    $a = \dfrac{v+1}{u T_0^{v}}$

where $T_0$ is already determined in the estimation of $h(t)$ and
$g(t)$.

By meeting these two conditions, the values of the parameters $v$
and $u$ of the $H(t)$ function are determined, since expressions
(6) and (8) yield the implicit form

(9)    $\dfrac{1}{v+1} = \left(\dfrac{t_0}{T_0}\right)^{v}$

which has a unique solution for $v$, and by substitution into (6)
and (8) the value of $u$ obtains.

Figure 3 shows the tabulation of the estimated parameters of the
$h(t)$ and $H(t)$ functions as obtained from the ratings of two
selected subjects for an assortment of dessert items. The last
three columns are obtained from the first four columns, with the
exception of $T_0$, which is the subjects' preferred time
interval for the items and is part of the input data. It is
noticeable that for most items the value of $v$ is fairly close
to one.

For reasons of analytical simplicity, the $H(t)$ function will be
used as the analytical model of preference-time relations in the
following parts of this paper. One can, however, further simplify
the function by assuming that $v = 1$ for most items. This way a
two-parameter approximation of the preference-time function
obtains in the form

(10)   $p(t) = a - \dfrac{1}{ut}$

where the parameters $a$ and $u$ are to be estimated from data.
It is of some practical significance that conventional food
preference questionnaires are sufficient to estimate these two
parameters. In most of these questionnaires the subjects are
asked to indicate how frequently they want to eat a given item,
and they also have to rate their preference for the item on some
hedonic scale. With the assumption that their preference rating
is conditioned by the estimated time interval $T_0$,
corresponding to their frequency rating, the preference ratings
can be regarded as direct estimates of $p(T_0)$, i.e., a point on
the preference-time function. This assumption, of course, can be
made operational by the appropriate phrasing of the questions.

With the estimated values of $p(T_0)$ and $T_0$ available, (10)
is one of the conditions, and

(11)   $\dfrac{d}{dt}\left[\dfrac{p(t)}{t}\right] = -\dfrac{a}{t^{2}} + \dfrac{2}{u t^{3}} = 0$ at $t = T_0$

is the other condition that the parameters $a$ and $u$ must
satisfy. Both conditions are satisfied if

(12)   $a = 2p(T_0)$,   $u = \dfrac{2}{a T_0}$

The analytical properties of this simplified preference-time
function imply that the estimated preference at $t = \infty$ is
twice as much as the preference at the time interval $T_0$. It is
interesting that the observed preference-time data (Figure 3) are
not too far from satisfying this property, since the estimated
value of the parameter $v$ was fairly close to one.

In the previous part the preference-time function was introduced
as a measure of satisfaction if a fixed portion of a food is
consumed by an individual at identical time intervals of length
$t$. Here we consider a particular food item (or menu item)
denoted by the subscript $j$, and will utilize the $H(t)$ and
$G(t)$ functions to get an expression for the measure of
satisfaction over the whole planning horizon of a menu plan.

Let $N$ be the total number of days included in the menu plan.
With a serving time interval of $t$ days, the number of times
menu item $j$ is offered is denoted by $x_j$. With the
standardized portion size for which the preference-time functions
are evaluated, the following identity holds:

(13)   $x_j = N/t$

Consequently, the preference at each consumption as a function of
the frequency, the number of times item $j$ is offered on the
menu plan, is

(14)   $H(x_j) = a_j - (1/b_j)(x_j/N)^{v_j}$

where $b_j$ takes the place of the parameter $u$ in (5).
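
The parameter matching described by conditions 1 and 2, and the two-parameter shortcut in (12), can be sketched as follows. This is a minimal illustration under assumptions: bisection is used as the unspecified "numerical method", the helper names are hypothetical, and the input values are invented rather than taken from Figure 3. One observation that follows from (9) is recorded in a comment: a positive root for v exists only when the ratio T0/t0 lies strictly between 1 and e.

```python
import math


def bisect(func, lo, hi, tol=1e-12, max_iter=200):
    """Plain bisection for a root of func bracketed by [lo, hi]."""
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi > 0.0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def zero_crossing(a, b, c, r, lo=1e-6, hi=100.0):
    """Solve phi(t0) = a(1 - exp(-r t0)) - b exp(-c t0) = 0 for t0."""
    def phi(t):
        return a * (1.0 - math.exp(-r * t)) - b * math.exp(-c * t)
    return bisect(phi, lo, hi)


def fit_H(a, t0, T0):
    """Return (u, v) of H(t) = a - 1/(u t**v) from conditions (6), (8), (9).

    Taking logarithms of (9) gives F(v) = v ln(T0/t0) - ln(1 + v) = 0.
    F(0) = 0 is a trivial root; a positive root exists only when
    1 < T0/t0 < e (derived here from (9), not stated in the text).
    """
    log_ratio = math.log(T0 / t0)
    if not 0.0 < log_ratio < 1.0:
        raise ValueError("a positive v needs 1 < T0/t0 < e")

    def F(v):
        return v * log_ratio - math.log1p(v)

    lo = 1.0 / log_ratio - 1.0  # F is negative at its minimum, located here
    hi = lo + 1.0
    while F(hi) <= 0.0:         # expand until the positive root is bracketed
        hi *= 2.0
    v = bisect(F, lo, hi)
    u = 1.0 / (a * t0 ** v)     # equation (6)
    return u, v


def fit_p(p_at_T0, T0):
    """Equation (12): parameters of p(t) = a - 1/(u t), i.e. v fixed at 1."""
    a = 2.0 * p_at_T0
    u = 2.0 / (a * T0)
    return a, u


if __name__ == "__main__":
    print("t0:", zero_crossing(8.0, 10.0, 0.3, 0.5))
    print("u, v:", fit_H(a=8.0, t0=2.0, T0=4.0))
    print("a, u:", fit_p(p_at_T0=4.0, T0=4.0))
```

When the zero crossing falls at exactly half of the preferred interval, (9) gives v = 1 and the general fit coincides with the shortcut (12), which is consistent with the remark that v is close to one for most items.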


The total preference derived by offering menu item $j$ $x_j$
times is obtained by multiplying $H(x_j)$ by $x_j$ to yield
$G(x_j)$:

(15)   $G(x_j) = a_j x_j - (1/b_j)(x_j/N)^{v_j} x_j$

or more simply

(16)   $G(x_j) = a_j x_j - b'_j x_j^{v_j+1}$

where

       $b'_j = (1/b_j) \cdot (1/N)^{v_j}$



It is readily seen that $G(x_j)$ is a unimodal function of $x_j$,
implying that it has a unique maximum at some value of $x_j$.
This implies that individuals tend to pace their consumption of
food $j$ on the time scale such that the preferred number of
times in the cycle length $N$ is given by the maximum of the
$G(x_j)$ function. In other words, for menu item $j$, $G(x_j)$ is
the preference objective function to be maximized by one
individual. $G(x_j)$ is a quadratic function if $v_j = 1$.
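
The maximum of the single-item objective can be written in closed form by setting the derivative of (16) to zero. The sketch below does this and checks that the resulting frequency equals N divided by the preferred interval. The parameter values and function names are hypothetical; `u` denotes the parameter of (5), which appears as b_j in (14).

```python
def b_prime_from_u(u, v, n_days):
    """Coefficient of x**(v+1) in (16): (1/u) * (1/N)**v."""
    return (1.0 / u) * (1.0 / n_days) ** v


def G(x, a, b_prime, v):
    """Equation (16): total preference from offering the item x times."""
    return a * x - b_prime * x ** (v + 1.0)


def preferred_frequency(a, b_prime, v):
    """Root of dG/dx = a - b'(v + 1) x**v = 0."""
    return (a / (b_prime * (v + 1.0))) ** (1.0 / v)


if __name__ == "__main__":
    # Hypothetical item: a = 8, u = 1/16, v = 1, on a 28-day cycle.
    b_prime = b_prime_from_u(u=0.0625, v=1.0, n_days=28)
    x_star = preferred_frequency(8.0, b_prime, 1.0)
    print("preferred number of servings:", x_star)
    print("total preference at optimum:", G(x_star, 8.0, b_prime, 1.0))
```

For these values the preferred interval from (8) is four days, so the optimum corresponds to serving the item once every four days over the 28-day cycle.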

Until now, $G(x_j)$ was tacitly assumed to be the preference
function of a given individual for food $j$. It is very likely
that different individuals have nonidentical preferences for the
same set of foods. Consequently, the notation $G_i(x_j)$ will be
introduced as the preference function of the $i$-th individual of
a given population for food item $j$. In particular,

(17)   $G_i(x_j) = a_{ij} x_j - b_{ij} x_j^{v_{ij}+1}$

will replace the notation used in (16). The parameters $a_{ij}$,
$b_{ij}$ and $v_{ij}$ can be established from questionnaires by
the earlier described methods. Clearly, the $G_i(x_j)$ functions
are concave for each individual $i$, so one can express the
preference function of a set of individuals, $M$, for item $j$ as
follows:

(18)   $G_M(x_j) = \sum_{i \in M} G_i(x_j)$

where $G_M(x_j)$ will have a unique maximum at some value of
$x_j$, since the sum of concave functions is also a concave
function. Thus, one can say that for a given population the
preference realized from food item $j$ is a maximum when $x_j$ is
equal to $x_j^0$, where

(19)   $G_M(x_j^0) = \max_{x_j} \sum_{i \in M} G_i(x_j)$.
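
Since the aggregate in (18) is concave, its derivative is decreasing in the frequency and the maximizer in (19) can be found by bisection on the derivative. The sketch below assumes the form (17) for each individual; the two parameter triples and the function names are hypothetical.

```python
def dG_M(x, params):
    """Derivative of (18) when each G_i has the form (17).

    params is a list of (a_ij, b_ij, v_ij) triples, one per individual.
    """
    return sum(a - b * (v + 1.0) * x ** v for a, b, v in params)


def population_optimum(params, lo=1e-9, hi=1.0, n_iter=200):
    """Bisection on the decreasing derivative to locate the maximizer of (18)."""
    while dG_M(hi, params) > 0.0:  # expand until the derivative turns negative
        hi *= 2.0
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if dG_M(mid, params) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Two hypothetical individuals whose own optima are 8 and 4 servings.
    population = [(8.0, 0.5, 1.0), (4.0, 0.5, 1.0)]
    print("population optimum:", population_optimum(population))
```

The population optimum falls between the individual optima, which illustrates why a single frequency may serve a heterogeneous population poorly.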


This is not to say, however, that the maximum of $G_M(x_j)$
coincides with the maxima of the $G_i(x_j)$ functions of the
individuals in the population, especially if the population is
very heterogeneous with respect to preferences for foods. This
problem can be resolved by partitioning the set of individuals
into subsets, or clusters, such that the within-cluster
homogeneity of individuals is maximum with respect to the set of
foods under consideration. The smaller the clusters are, the more
clusters are needed, and the cluster maximum $G_M(x_j^0)$ will be
closer and closer to the maximums of the $G_i(x_j)$ functions
within the clusters. It suffices to say here that for any number
of desired partitionings of set $M$, techniques of cluster
analysis are available to find the most homogeneous set
membership of clusters and thus the corresponding values of the
$G_M(x_j)$ functions for any set of food items.

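The function $G_M(x_j)$, as defined by (18), needs $3 \times M$
parameters for its evaluation (three for each individual in $M$).
Parameter reduction can be performed on $G_M(x_j)$, too:

       $G_M(x_j) = \sum_{i \in M} (a_{ij} x_j - b_{ij} x_j^{v_{ij}+1})$

                 $= x_j \left[ \sum_{i \in M} a_{ij} - \sum_{i \in M} b_{ij} x_j^{v_{ij}} \right]$

By noting that

       $x_j^{v_{ij}} = \sum_{n=0}^{\infty} (v_{ij} \ln x_j)^n / n!$

the expression for $G_M(x_j)$ can be written as

       $G_M(x_j) = x_j \left[ \sum_i a_{ij} - \sum_i b_{ij} \sum_n (v_{ij} \ln x_j)^n / n! \right]$

                 $= x_j \left[ \sum_i a_{ij} - \sum_i b_{ij} - \ln x_j \sum_i b_{ij} v_{ij} - \dfrac{(\ln x_j)^2}{2!} \sum_i b_{ij} v_{ij}^2 - \dfrac{(\ln x_j)^3}{3!} \sum_i b_{ij} v_{ij}^3 - \cdots \right]$

By denoting

       $w_n = \sum_i (b_{ij} v_{ij}^n / n!)$

       $\bar{a}_j = \sum_i (a_{ij} - b_{ij})$

the above expression reduces to

(20)   $G_M(x_j) = x_j \left[ \bar{a}_j - \sum_{n=1}^{\infty} (\ln x_j)^n \cdot w_n \right]$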


n=l 


We note that the summation in the above expression forms a
monotonically decreasing convergent series, and hence an
arbitrarily accurate approximation to the $G_M(x_j)$ function can
be obtained by retaining a finite number of terms in the
summation. Although no tests have been performed to observe how
fast the series converges, it seems that for large $M$, much
fewer than $3 \times M$ parameters need to be used to obtain an
accurate approximation to the function. It is realized that
computing $w_n$ initially uses all the original parameters.
However, this need be done just once, and it can be done before
the parameters are actually put to use in the nonlinear
programming model described in the next section.
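
The effect of truncating the series in (20) can be examined with a short numerical sketch. The three parameter triples below are hypothetical and the function names are illustrative; the point is only to compare the direct sum (18) with the truncated form of (20) for a few truncation lengths.

```python
import math


def G_M_exact(x, params):
    """Equation (18) with each G_i of the form (17).

    params is a list of (a_ij, b_ij, v_ij) triples, one per individual.
    """
    return sum(a * x - b * x ** (v + 1.0) for a, b, v in params)


def series_weights(params, n_terms):
    """Return a_bar and the weights w_1, ..., w_n_terms used in (20)."""
    a_bar = sum(a - b for a, b, _ in params)
    weights = [
        sum(b * v ** n for _, b, v in params) / math.factorial(n)
        for n in range(1, n_terms + 1)
    ]
    return a_bar, weights


def G_M_series(x, a_bar, weights):
    """Equation (20) truncated after len(weights) terms."""
    log_x = math.log(x)
    tail = sum(w * log_x ** n for n, w in enumerate(weights, start=1))
    return x * (a_bar - tail)


def truncation_error(x, params, n_terms):
    """Absolute difference between (18) and the truncated (20)."""
    a_bar, weights = series_weights(params, n_terms)
    return abs(G_M_exact(x, params) - G_M_series(x, a_bar, weights))


# Hypothetical individuals, for illustration only.
PARAMS = [(8.0, 0.5, 1.0), (6.0, 0.4, 0.9), (7.0, 0.6, 1.1)]

if __name__ == "__main__":
    for n_terms in (3, 10, 25):
        print(n_terms, truncation_error(5.0, PARAMS, n_terms))
```

Once the weights are computed, evaluating the truncated form needs only a_bar and the retained w_n, whatever the size of the population. That is the parameter economy the text refers to.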


Expression (16) for $G(x_j)$ was obtained from (5), which was
obtained from (3) to effect an economy in the number of
parameters in the function. One can, of course, obtain the
$G(x_j)$ function directly from (3), using all four of its
parameters. The resulting function is

(21)   $G(x_j) = a x_j - \dfrac{b x_j e^{-c'/x_j}}{1 - e^{-r'/x_j}}$

where $c' = Nc$ and $r' = Nr$. Figure 4 displays the form of the
$G(x_j)$ functions as obtained from a Skylab astronaut for two
entrees.

III  NONLINEAR PROGRAMMING SOLUTION TO THE MENU PLANNING PROBLEM

The fundamental problem of menu planning is to define which menu
items should appear on the menu in a given time period, called a
cycle, and how many times each should appear. According to the
definition adopted here, menu planning is a decision problem
which concentrates on the whole cycle, and attempts to find an
optimum plan in terms of finding optimum frequencies of the items
under consideration. In contrast, menu scheduling is defined as a
problem of deciding which item should appear in which meal and
day.

Scheduling and planning, of course, are
intimately related in the sense that the time aggregate
of the menu schedule for a cycle is the menu plan.
In turn, a properly structured menu plan can be
partitioned, i.e., scheduled into a sequence of
menus. The presentation of the material will be
based on this latter principle. For the sake of
conceptual clarity, nonselective menus will be
considered first.

It is assumed that a set of n menu items is
subject to a menu planning decision such that for
each item j, optimum quantities x_j (j = 1, 2, ..., n)
should be determined for a given set of individuals
and for a menu cycle of s days. In this context
the meaning of x_j is the number of unit portions of
menu item j to be allocated on the menu during s
days. It is assumed further that the set of n
menu items can be partitioned into K subsets
according to the course structure, such as entrees,
starch, vegetable, etc. of the meals for a day.
Let w_k (k = 1, 2, ..., K) be the relative weight of
course k in the total preference of the meals.
With these notations, the total preference of a set
of individuals M for n menu items can be
expressed as follows:

(22)  G(X) = \sum_{k=1}^{K} w_k \sum_{j=n_{k-1}+1}^{n_k} G_M(x_j)

where n_0 = 0, n_K = n, (n_k - n_{k-1}) is the number of menu
items in course k, X is the n-vector of x_j
components, and


(23)  G_M(x_j) = \sum_{i \in M} G_i(x_j)


as in (18), or as defined in (20) to achieve
parameter economy. Consequently, G(X) is a weighted
additive function of nonlinear expressions which
all depend on the aggregate preference-quantity
functions of a population for each of the n items
involved. Inasmuch as the objective of menu
planning is to select menu items for a cycle of s
days which will be most preferred, this objective
can be reached by finding the maximum of the G(X)
function. To insure, however, that the resulting
vector X is appropriate for the purposes of
scheduling as well as from the point of view of other
considerations, such as budgetary, nutritional and
compatibility considerations, only the constrained
maximization of G(X) will provide, in general,
acceptable results. This leads to a nonlinear
programming formulation of menu planning.

Accordingly, menu planning with a preference
maximization objective can be formulated as a
nonlinear programming problem stated as follows:

(24)  max. G(X)   s.t.   c^T X \le c_0,
                         AX \ge B,  MX \le S,  RX \le D


where

G(X)     is the nonlinear objective function
         identical to (20).

c^T      is the n-vector of unit portion costs of
         menu items.

A        is the m×n matrix of the nutrient
         composition of menu items, with the a_ij element
         indicating the amount of nutrient i in
         one portion of menu item j.

B        is the m-vector of the nutrient
         allowances for some reference person for s
         days.

M        is a K×n incidence matrix containing
         staggered rows of unit coefficients
         corresponding to the availability of the items
         for given courses.

S        is a K-vector of components s or 2s
         for nonselective menus indicating the
         number of items needed for a course for a
         cycle of s days.

R        is an L×n matrix of coefficients for
         assorted attribute constraints,
         proportionality constraints, production
         constraints, etc. which define feasibility
         conditions for scheduling the vector X.

D        is an L-vector defined by the constraints
         above.

X        is the vector notation for the menu plan
         which is fully defined by the values of
         the components of X. If the j-th
         component of X in the solution is not zero,
         x_j represents the number of portions of
         menu item j to be allocated for s days.

The above definition of x_j and its role with
respect to the feasibility of scheduling requires
that all the components of X be integers.
Strictly speaking, menu planning is a nonlinear and
integer programming problem. Such problems are
still considered intractable in theory. In
practice, however, the problem is not too serious
because of two favorable factors. First, s can be
rather large. Sixty day or 90 day menu cycles are
common, so the number of portions to be represented
by the x_j components can be large integers where
the effects of rounding are relatively minor.
Second, the nonlinear programming problem posed in
(24) is well suited for solution by piecewise
linearization and convex separable programming
techniques where the grid points of the linearized
variables can be conveniently selected to coincide
with unit portions. This way all the upper bounds
of the auxiliary variables will correspond to
integer values, and experience with such upper bounded
linear programming models for menu planning has
shown that in such cases a sizeable majority of the
bounds tend to bind, and thus most of the
components of the X-vector will be integer valued.

It should be mentioned here that an alternate
nonlinear programming formulation of menu planning
can be derived from (24) and considered as useful
for institutional feeding programs where the
management objective is to maintain a given food
preference level at minimum cost. This version of
the problem is explicitly stated here for further
reference:

(25)  Min. c^T X   s.t.   G(X) \ge u_0,
                          AX \ge B,  MX \le S,  RX \le D.

Here u_0 is some minimal level of preference to
be maintained while the rest of the notations mean
the same as in expression (24). This structure is
similar, but not superior, to the "best buy" linear
programming models discussed in [5], where the
nonlinear constraint is replaced by a set of upper
bounds.

The nonlinear programming problem (24), after
the addition of slack and surplus variables, can be
written in the form

(26)  max. G(X) = \sum_{k=1}^{K} w_k \sum_{j=n_{k-1}+1}^{n_k} G_M(x_j)

      s.t.  \sum_j l_{ij} x_j = d_i,    i = 1, 2, ..., m+K+L

            x_j \ge 0

The above problem, although it is nonlinear
and large in size (well over 50 constraints and
400 variables for food service establishments), is
amenable to efficient solution techniques in
existence [13]. The objective function of problem (26)
is additively separable, and this makes the
application of what Wolfe [19] has termed
grid-linearization particularly efficient.

We recapitulate briefly the features of a
grid-linearization algorithm for the solution of
the nonlinear additively separable problem (26).

An initial set of grid points {x_jt} is
defined for each variable x_j, yielding the
linearized program in the variables \lambda_{jt}:

(27)  max.  \sum_{k=1}^{K} w_k \sum_{j=n_{k-1}+1}^{n_k} \sum_t \lambda_{jt} G_M(x_{jt})

      s.t.  \sum_j \sum_t l_{ij} x_{jt} \lambda_{jt} = d_i,    i = 1, 2, ..., m+K+L

            \sum_t \lambda_{jt} = 1       for all j

            \lambda_{jt} \ge 0            for all j, t

If the initial number of grid points for each
variable in problem (27) is large, an acceptable
approximation of the nonlinear objective function
may result. However, if fewer grid points are used,
new grid points can be created in the framework of
the solution algorithm.

If, at iteration r, the linearized program
is solved, an optimal solution {\lambda_{jt}^r} and simplex
multipliers (\Pi^r, \Pi_0^r) are available. For any
variable x_j, a new grid point is sought such that it
produces the most negative reduced cost factor.
This corresponds to solving the unconstrained
problem

(28)  \min_{x_j}  \sum_{i=1}^{m+K+L} \pi_i^r l_{ij} x_j + \pi_{0j}^r - w_k G_M(x_j)


Note that if G_M(x_j) is concave, the above function
in x_j is convex and the minimum is unique. The
minimum in (28) can be efficiently determined by
any appropriate technique such as the Method of
Golden Sections [18]. The new grid point thus
created can be added to the current set, and a new
iteration begins.
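A golden-section search of the kind mentioned above can be sketched as follows. This is a generic textbook version, not the authors' FORTRAN routine; the square-root preference function and the multiplier values (`price`, `pi0`, `weight`) are hypothetical and chosen only so that the convex reduced-cost function has a known minimum.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # reciprocal of the golden ratio

def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

def reduced_cost(x, price, pi0, weight, pref):
    """price*x + pi0 - weight*pref(x); convex in x when pref is concave."""
    return price * x + pi0 - weight * pref(x)

def sqrt_preference(x):
    """Hypothetical concave preference-quantity function."""
    return 10.0 * math.sqrt(x)

# New grid point for one variable over a 42-portion range.
x_new = golden_section_min(
    lambda x: reduced_cost(x, 1.25, 3.0, 1.0, sqrt_preference), 0.0, 42.0)
print(round(x_new, 3))
```

Each pass reuses one of the two interior function values, so only one new evaluation of the preference function is needed per iteration, which matters when the function aggregates many individuals.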

Earlier  convergence  proofs   [10]  required  that 
all  columns  be  retained  from  iteration  to  iteration. 
However,  recently  Murphy  [14]  has  developed  some 
column  dropping  procedures,  although  no  experience 
of  the  efficacy  of  such  procedures  is  cited. 

It may be mentioned that the constraints in
problem (27) of the form \sum_t \lambda_{jt} = 1 can be handled
as generalized upper bounds. Moreover, the columns
generated can be implicitly stored.
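The role of the weights in problem (27) can be made concrete with a small sketch. It does not solve the linear program; it only shows, for a single variable, how a quantity is written as a convex combination of two adjacent grid points and what value the linearized objective assigns to it. The function names and the square-root preference function are hypothetical.

```python
import math

def lambda_weights(x, grid):
    """Weights (nonnegative, summing to 1) on two adjacent grid points
    such that sum(weight_t * grid[t]) equals x. grid must be increasing."""
    if not grid[0] <= x <= grid[-1]:
        raise ValueError("x lies outside the grid")
    weights = [0.0] * len(grid)
    for t in range(len(grid) - 1):
        left, right = grid[t], grid[t + 1]
        if left <= x <= right:
            share = (x - left) / (right - left)
            weights[t], weights[t + 1] = 1.0 - share, share
            break
    return weights

def linearized_value(x, grid, pref):
    """Value of the piecewise-linear approximation of pref at x."""
    return sum(w * pref(g) for w, g in zip(lambda_weights(x, grid), grid))

def sqrt_preference(x):
    """Hypothetical concave preference-quantity function."""
    return 10.0 * math.sqrt(x)

grid = list(range(0, 11))          # integer grid points = unit portions
for x in (4.0, 4.5):
    print(x, linearized_value(x, grid, sqrt_preference), sqrt_preference(x))
```

With integer grid points the approximation is exact at whole portions and lies below a concave preference function in between, which is one way to see why solutions of the linearized program tend toward integer values.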

IV  DISCUSSION 

An optimum integer vector X from solving
either (24) or (25) produces only the menu plan
which is to be scheduled by some other method. It
is important to realize that the plan X already
satisfies the most significant conditions
pertaining to population preferences, total cost,
nutrition, and other general aspects of feasibility.
The only remaining objective of scheduling is to
assure compatibility among the menu items within
meals and between meals. Compatibility is a
property of interactions between items, and not
necessarily a property of the items per se. Thus, in
scheduling we deal with acceptability aspects of
combinations of items, which were not directly
considered in the objective function. The model
proposed thus separates the criteria of menu planning
and scheduling into two distinct optimization
processes for technical reasons. In the planning
phase, food preferences are maximized as separable
functions by the powerful method of nonlinear
programming. In the scheduling phase, compatibility
is to be maximized by techniques still in the
exploratory stages. One obvious possibility is the
arrangement of items by manual methods. More exact
approaches are conceivable by algorithms based on
multidimensional scaling and graph theoretical
methods presently under investigation [4,12].

For populations which are heterogeneous with
respect to their preferences for a given set of
menu items, nonselective menus cannot be optimal
[7]. An improvement on the optimality can be
effected by partitioning the set of individuals M
into subsets of individuals M_i (i = 1, 2, ..., t),
where \bigcup_{i=1}^{t} M_i = M, and each of the t subsets will
be more homogeneous with respect to preferences
than the set M if the partitioning is done by
cluster analytic methods [12]. In this case the
G_{M_i}(X^i) function for cluster M_i will have its
optimum x_j values closer to the individuals'
preference-quantity function optimums than is possible
for set M. This is simply the theoretical
interpretation of the rationale of offering selective
menus. In short, each cluster corresponds to a
potentially different optimum menu plan, hence a
potential need for a choice on the menu which is
expected to be exercised by the individuals in the
particular cluster. By increasing the value of t,
more homogeneous clusters can be created, more
individual preferences will coincide with the
respective G_{M_i}(X^i) optimums, but also more choices
will be needed and more kinds of items are to be
prepared for each meal. It is conceivable that t,
i.e., the potential number of distinct choices, has
some practical bound, and there is evidence [7]
that the greatest improvements in preferences occur
at low values of t, such as t=2 or t=3.


The menu planning problem for a heterogeneous
population is therefore pictured as the (joint)
solution of t nonlinear programming problems with
objective functions G_{M_i}(X^i) corresponding to
clusters M_i (i = 1, 2, ..., t), yielding optimum
nonidentical menu plans X^i for each cluster. This is to say
that the planning problem associated with selective
menus is not particularly different from the one
described earlier. The scheduling problem becomes,
however, somewhat complicated.

Let us denote the cardinality of cluster M_i by
m_i, and the cardinality of M by m_0, where
m_0 = \sum_{i=1}^{t} m_i. The consistency between menu plans
X^1, X^2, ..., X^t and the corresponding selective menu
schedule requires that if item j is represented
in the plans in quantities x_j^1, x_j^2, ..., x_j^t, then
the time averaged marginal probabilities of
choosing these items from the schedule in s days must
be equal to the marginal probability of choosing
item x_j from other items in the given course by the
total population in s days, which is equal - on
the basis of cluster preferences - to

(29)  p_j = \frac{1}{s m_0} \sum_{i=1}^{t} m_i x_j^i

It is a difficult and thus far unresolved problem
to schedule selective menus which satisfy this
criterion, because compatibility between menu items
affects the joint probabilities of selections.
Even if condition (29) is satisfied, the freedom
of choice provided by selective menus introduces
random variables in the food service system, and
necessitates redefining the food cost, nutritional,
and other constraints in probabilistic terms.

One can, of course, avoid some of these
problems by scheduling each of the X^i option plans as
nonselective menus, offering selectivity only
among the menus, but not the items, and hoping that
on the average, population cluster M_i will prefer
to select the corresponding X^i menus from the
schedule. Obviously, more research is needed on
these points.

V    COMPUTATIONAL RESULTS

The grid-linearizing algorithm to solve the
preference-maximizing menu-planning problem was
coded in FORTRAN for the CDC 6600. The program
requires approximately 25000 words of active
storage to solve 30-constraint, 320-variable
problems, where the number of variables refers to the
number before linearization. No passive storage is
used. Approximately 7000 words of storage are
problem-size-independent. Since integrality of the
solution is desired, only integer grid points are
used, and the program has the capacity to enforce
different upper bounds on the variables.

Twenty-constraint, 220-variable problems
require approximately 20 CPU seconds, and
20-constraint, 120-variable problems take less than 10
CPU seconds on the CDC 6600. Furthermore, if a
"good" starting solution is used, the solution
times decrease markedly.


Portions of the solutions to two sample
problems are presented below. The nutrient and cost
attributes are from a U.S. Army Master Recipe File.
The problems consist of determining serving
frequencies of menu items for a 42-day evening meal cycle.
Although six-course meals were planned, the display
contains frequencies of only the entrees under raw
food cost budget limits of $42.00 and $35.00 for
all the courses.

NO.   ITEM NAME                       SERVING FREQUENCY
                                        (per 42 days)
                                   Budget=$35   Budget=$42

 1    ROAST BEEF                      3.00         4.00
 3    GRILLED BEEF STEAK              3.00         5.70
 6    SWISS STEAK W/BROWN GRAVY       1.00         2.00
 8    MEAT LOAF                       2.00         2.00
 9    GRILLED SALISBURY STEAK         3.00         2.00
11    SWEDISH MEATBALLS               2.00         1.00
15    BAKED HAM                       1.00         1.00
22    GRILLED SAUSAGE PATTIES         1.00         0
23    BARBECUED RIBS                   .26         1.00
24    ROAST VEAL                      1.00         1.00
26    BAKED CHICKEN                   2.00         2.00
27    FRIED CHICKEN                   4.00         4.00
28    ROAST TURKEY                    1.00         1.00
29    HOT TURKEY SANDWICH             4.00         3.00
30    FRIED FISH                      1.00          .30
36    FRENCH FRIED SHRIMP             2.00         2.00
37    SEAFOOD PLATTER                 2.00         2.00
40    SPAGHETTI W/MEATBALLS           4.00         4.00
42    BEEF STEW                       1.00         1.00
44    CHILI CON CARNE W/BEANS         3.74         3.00

      PREFERENCE                     85.24        86.69

In both solutions, the use of integer grid points
helps make the solution almost integer, with only
two fractional values for the variables. We are
currently using a search procedure to round the
solution obtained from the nonlinear program by
permitting fractional variables to change only to
the next lower or next higher integers, and
nonfractional variables to change by at most one.
Other schemes such as in [1] are also possible.
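The neighborhood used by such a rounding search can be sketched as below. The source does not describe the authors' search in detail, so this is an exhaustive enumeration under stated assumptions: the function names, the `objective` and `feasible` callables, and the three-item example are all hypothetical, and full enumeration is practical only when few variables are allowed to move.

```python
import itertools
import math

def candidate_values(x, eps=1e-9):
    """Integer values a component may take in the rounded plan."""
    nearest = round(x)
    if abs(x - nearest) <= eps:
        # Already integral: allow a change of at most one portion.
        return [v for v in (nearest - 1, nearest, nearest + 1) if v >= 0]
    # Fractional: next lower or next higher integer only.
    return [math.floor(x), math.ceil(x)]

def round_plan(x_frac, objective, feasible):
    """Return the best feasible integer plan in the neighborhood of x_frac."""
    best, best_val = None, -math.inf
    for cand in itertools.product(*(candidate_values(x) for x in x_frac)):
        if feasible(cand):
            val = objective(cand)
            if val > best_val:
                best, best_val = cand, val
    return best, best_val

# Hypothetical three-item course that must fill exactly 7 servings.
plan, value = round_plan(
    [3.0, 0.26, 3.74],
    objective=lambda x: 3 * math.sqrt(x[0]) + 2 * math.sqrt(x[1]) + 4 * math.sqrt(x[2]),
    feasible=lambda x: sum(x) == 7,
)
print(plan, round(value, 4))
```

In a full-size problem the feasibility test would carry the cost, nutrient and course constraints of (24), and the enumeration would normally be restricted to the fractional variables and a few of their neighbors.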

Even with a 42-day budget reduction of 15%,
the solution is able to exploit item substitution
to yield less than a 2% reduction in acceptability.
Unlike conventional trial-and-error menu planning
procedures, this method prevents over-reaction to
budgetary and price fluctuations.

An apparent shortcoming of the procedure is
that acceptability and schedulability of the item
frequencies obtained from the mathematical program
depend on putting together day-to-day combinations
of items that are compatible. The nonlinear
program bypasses the issue of compatibility. However,
blatant compatibility effects can be incorporated
in the form of constraints. For example, if a
sandwich appears on the menu, there may be no need
to serve bread in addition. This situation can be
handled via the constraint


\sum_{j \in S} x_j + \sum_{j \in B} x_j \le N

where N is the number of days in the cycle, B
is the index set of breads, and S is the index
set of items that preclude bread from appearing on
the menu. We enforce such "exclusion" constraints
for several courses. Another type of constraint is
the "inclusion" constraint, which enforces the
appearance of one item if another appears. Thus
Applesauce can be forced to appear as often as Pork
Chops. In our experience, once the constraint set
considers schedulability, the compatibility
problem becomes easy to handle.
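The two constraint types can be checked on a candidate plan with a few lines of code. This is only an illustration: the item names and frequencies are invented, and reading "as often as" as "at least as often as" is an assumption.

```python
def exclusion_ok(plan, excluding_items, bread_items, n_days):
    """Exclusion constraint: servings of items in S plus servings of
    breads in B may not exceed the number of days N in the cycle."""
    total = sum(plan[j] for j in excluding_items) + sum(plan[j] for j in bread_items)
    return total <= n_days

def inclusion_ok(plan, item, companion):
    """Inclusion constraint, read here as: the companion item appears
    at least as often as the item that calls for it (an assumption)."""
    return plan[companion] >= plan[item]

# Hypothetical 42-day plan (serving frequencies per item).
plan = {
    "hot turkey sandwich": 4,
    "dinner rolls": 20,
    "white bread": 18,
    "pork chops": 3,
    "applesauce": 3,
}
print(exclusion_ok(plan, ["hot turkey sandwich"], ["dinner rolls", "white bread"], 42))
print(inclusion_ok(plan, "pork chops", "applesauce"))
```

In the mathematical program itself both conditions are linear in the x_j, so they enter as additional rows of the matrix R in (24).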


[Figure 2: plot for subject code S07, dessert code 5909;
horizontal axis: time (t days).]

Figure 2.  The shape of the f(t), h(t) and g(t)
           functions for parameter values of
           a=90.67, b=67.14, c=0.0884 and
           r=0.3760.

Source:    Reference [5].


Subject/Item
Codes          a        b       c        r        T_0     v        1/u

S06/5909     20.00    10.34   0.0336   1.3308     1.0   1.1430   0.2074
S06/6011     25.00    11.61   0.0374   0.2817     4.0   1.0001   0.0200
S06/5037     itO.00   15.17   0.0334   0.J4S0     5.0   0.9590   0.0104
S06/5111     30.00     9.12   0.0359   0.0713     7.0   1.0631   0.00359
S07/5082     80.00    87      0.0628   0.1848     7.0   0.9089   0.00407
S07/5909     90.00    54.64   0.0624   0.2986     5.0   0.9189   0.00486
S07/5107    100.00    45.78   0.0805   0.3442     3.0   0.9288   0.00695
S07/5013     60.50    21.27   0.0451   0.0114    30.0   1.4754   0.000271

Figure 3.  Estimated parameters of two analytical
           models of the preference-time function
           for foods.

Source:    Reference [6].


[Figure 1: horizontal axis: time.]

Figure 1.  The change of preference over time when
           the item is consumed on days 9, 16, 19
           and 28, the item having been consumed
           9 days prior to day 1. The parameters
           of the preference-time function used for
           this plot are: a=100, b=40, c=0.05,
           r=0.4.

Source:    Reference [7].
[Figure 4: horizontal axis: quantity (servings in 28 days).]

Figure 4.  Unconstrained preference-quantity
           functions of an astronaut for two
           entrees.
Abstract

Although mathematical programming
algorithms and related computer software
have been in existence for over twenty-five
years, and new methods are being
invented, revised and implemented at a rapid
pace, there have been few (if any)
concrete suggestions for conducting
experiments to evaluate competing numerical
techniques. Yet the computer-operations
research folklore abounds with information
about the reputed efficiencies of various
programs. In an initial attempt at
improving this condition, we analyze the problem
from a statistical point of view. The
approach is contingent on the existence of a
well-defined population of test problems
from which a statistical sampling is
carried out. By measuring the relative
performance of various codes on the sample
problems, predictions can be made as to
their relative performance on the
population of problems under consideration.
Furthermore, the significance of these
predictions can be measured in a rigorous way
using standard statistical procedures. As a
demonstration of our methodology we
consider in detail a comparison of a
primal-simplex network code with one based on the
out-of-kilter method. We show how one may
define, in a precise manner, a population
of test problems and what conclusions may
be drawn from a simple random sampling
procedure.

I.  Introduction 

Since the inception of the simplex
method for linear programming thirty years
ago and the simultaneous development of
computers, researchers have been concerned
with analyses and comparisons of
mathematical programming techniques. At first,
variants of the simplex method, such as the
revised simplex method and the product form
of the inverse, were proposed as
alternatives to the original simplex design and
later as internal routines for a variety of
nonlinear and combinatorial programming
techniques. One of the first empirical
analyses of these proposals was by Wolfe and
Cutler [1963]. A plethora of computational
studies have followed (cf. Kuhn and Quandt
[1963], Florian and Klein [1970], Gilsinn
and Witzgall [1973], Srinivasan and
Thompson [1973], Zanakis [1973] and Barr et al
[1974]). Paralleling these studies have
been a host of informal unpublished
experiences from which a substantial folklore
about the relative effectiveness of
various techniques has arisen; despite these
studies, there is still little agreement
today. Since the out-of-kilter network
algorithm (a primal-dual approach) was
published in Ford and Fulkerson [1962], for
example, a debate has raged over the
relative superiority of this method versus the
network-specialized primal simplex methods
(see Dantzig [1963], Aashtiani [1976],
Klingman et al [1974], and Hatch [1975]).


There are several underlying causes for
this lack of agreement: (1) the absence
of a graded set of standard test problems,
(2) uncertainties about the efficiencies
of different computers, (3) incomplete
descriptions of the experimental design
variables when reporting computational
experience, and (4) a lack of guidelines for
performing computational experiments.

In other areas of mathematical
programming such as nonlinear programming,
attempts have been made to compare
algorithms; see for example Colville [1970],
Dembo [1975] and Rijckaert [1975]. Here,
it is much more difficult to come to any
firm conclusions regarding the relative
performance of the particular codes in
question than in the case of (say) network
algorithms. Factors such as internal
tolerance settings, accuracy of the solution
and whether or not a code actually
computed a Kuhn-Tucker point, all play a
crucial role in evaluating the behavior of a
coded algorithm.

There is no question as to the need
for evaluating the relative performance of
coded algorithms. From a theoretical
point of view, algorithms are often
evaluated on a "worst case" basis. This type
of analysis is often misleading for a coded
form of the algorithm. For example, it is
widely accepted in the mathematical
programming literature that Kelley's cutting
plane algorithm [1960] is not a good
method for solving convex programming
problems in the sense of convergence rates and
numerical stability. It is probably
because of this folklore that one hardly ever
hears of nonlinear codes based on Kelley's
cutting plane algorithm. It is quite
conceivable, however, that such a method might
provide a basis for a relatively efficient
and robust code for certain useful classes
of mathematical programming problems. This
has actually been demonstrated recently for
small- to medium-sized geometric
programming problems. In two independently
conducted comparative GP studies (Dembo [1975],
Rijckaert [1976]), a cutting plane
algorithm was shown to be one of the most
efficient¹ and robust² codes tested.

One of the major obstacles in
conducting computational analyses involves
conceptual differences between mathematical
algorithms, the computer software and the
detailed, problem-specific tactics which
occur when empirical results are collected.
Figure 1 graphically depicts the situation.


[Figure 1: levels shown are mathematical algorithm (level 1),
software implementation (level 2), internal tactics (level 3)
and empirical results.]

Figure 1
Conceptual Framework

Mathematical algorithms occur at the
highest level of abstraction (level 1);
following Zangwill [1969], we define algorithm as
a sequence of point-to-set mappings from
which theoretical results can be derived -
for example, infinite convergence
properties, or algorithmic efficiency as defined
by the number of iterations in a worst
case analysis. Moving down to the next
level of detail, we encounter computer
software. For each mathematical algorithm
(level 1) there are a host of computer
software implementations (level 2) with
widely varying degrees of effectiveness.
These implementations range from
undergraduate student LP projects to IBM's MPSX
system. It is important to note that a
basic, and distinguishing property of level
2 is the underlying information structure.

For each computer software
implementation, there are usually many control
settings which define internal tactics that
the program utilizes in solving a
particular problem or a problem-class. These
control settings can result in drastic
differences in efficiency, especially for
large-scale mathematical programs. To
illustrate the variation which can occur, a
single assignment network with 10,000
nodes (constraints) and 30,000 arcs
(variables) was solved (see Mulvey [1975]) with
three different pivot strategies, that is,
the procedure for selecting which eligible
non-basic variable enters the basis at each
pivot. The computational results are as
follows:

Pivot Strategy      Seconds³      Pivots

     (1)              3076        472,999
     (2)              3028         28,113
     (3)               971        136,204


1 In terms of standardized central
processing time.

2 In terms of the number of problems for
which the code actually computed a correct
solution.


A similar phenomenon occurs in nonlinear
programs where a small change in the
tolerances causes large shifts in
computational results (see for example Dembo
[1975]).

Admittedly, the elements at each level
(mathematical algorithm, software
implementation, internal tactics) cannot be
precisely defined and universally accepted.
To some, a minor change in internal tactics
is really a change in the fundamental
mathematical algorithm. To avoid these
difficult issues, we took a different approach
for developing a framework with which we
could analyze and compare mathematical
algorithms and related software.

In the next section, we review the
various types of test problems that may be
used in numerical comparisons, and Section
3 provides the above mentioned framework by
concentrating on well-defined collections
of test problems. A statistical analysis
is then undertaken in Section 4. In these
experiments, it is not our primary
intention to exhaustively test specific
techniques, but to show how these comparisons
can be conducted in light of a statistical
analysis.
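As a rough illustration of the kind of comparison meant here, the sketch below computes a paired t-statistic for two codes run on the same sample of test problems. The paired t-test is one standard procedure and is an assumption on our part; the analysis in Section 4 may use a different one. The timings are synthetic and only serve to exercise the function.

```python
import math
import statistics

def paired_t_statistic(times_a, times_b):
    """Mean difference and paired t-statistic for two codes that were
    run on the same sample of problems (one timing pair per problem)."""
    if len(times_a) != len(times_b) or len(times_a) < 2:
        raise ValueError("need at least two paired observations")
    diffs = [a - b for a, b in zip(times_a, times_b)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err

# Synthetic solution times (seconds) on four sampled problems.
code_a = [10.0, 12.0, 14.0, 16.0]
code_b = [9.0, 10.0, 13.0, 14.0]
mean_diff, t_stat = paired_t_statistic(code_a, code_b)
print(mean_diff, round(t_stat, 3))
```

The statistic would then be compared with a t-distribution with n-1 degrees of freedom; the conclusion extends to the population of problems only if the sample was drawn from it at random, which is the point of working with a well-defined population.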


3 IBM 370/155 seconds, Fortran G compiler.
2.  Test Problems

There are two main categories of
problems that are currently used for reporting
numerical results in the mathematical
programming literature. Problems are either
hand-selected or are randomly generated
using a pseudo-random number generator. In
both cases one encounters serious
drawbacks when attempting a scientific
analysis of computer codes used to solve a
particular class of problems.

On the one hand, the behavior of
computer codes on randomly generated problems
does not in general reflect the behavior
of these same codes on similar sized real
world problems. This may be due in part to
the fact that in real world problems there
is usually some degree of correlation
among variables. Correlation may be
incorporated in randomly generated problems
to accurately model real world behavior;
however, both the nature of the
application and the degree of correlation among
variables would have to be known and thus
the problem generator would have to vary
from application to application.

On  the  other  hand,    if  used  for  com- 
parative purposes,   computational  results 
obtained  from  problems  that  have  been 
selected  from  models  of  real-world  situa- 
tions make  a  statistical  analysis  of  re- 
sults questionable.      In  this  case,  conclu- 
sive statements  regarding  the  performance 
of  these  codes  can  only  be  made  for  the 
particular  problem  set  under  considera- 
tion and  any  generalizations  must  be 
treated  with  suspicion. 

The  relative  merits  of  the  above 
categories  of  test  problems  are  summa- 
rized below: 


Hand-Selected Problems          Randomly Generated Problems

Usually representative          Usually not representative
of real-world behavior.         of real-world behavior.

Expensive to collect,           Problem generators can be
document and send from          designed to be portable
one researcher to another.      and machine independent.

Population of problems          Population of problems is
from which sample problems      known and can be controlled.
are drawn is not known.         If sampling method is known,
Thus, generalizations           generalizations based on
based on the sample are         sample statistics can be
questionable.                   made with a known degree of
                                certainty.


In this study we will restrict ourselves to pseudo-randomly
generated problems. We have chosen to do so because inferences
can be made about populations of test problems in a precise
manner.

Since this is an initial attempt in a field that is relatively
untouched, we have decided to restrict our attention to
comparing the performance of two network codes on a
well-defined population of assignment problems. Part of our
aim is to begin defining a standard format for researchers to
follow when reporting results on the behavior of coded
algorithms.

The  two  codes  considered  in  the  next 
section  are: 

KILTER:  A network code based on an out-of-kilter algorithm
         and written by Aashtiani [1976]

and

LPNET:   A network code based on the primal simplex method
         and written by Mulvey [1975].

There are a number of reasons for starting our analysis with
network codes as applied to the solution of assignment
problems. First, network calculations involve manipulation of
integers, and therefore tolerances, which are difficult to
standardize and which play such an important role in comparing
nonlinear programming algorithms, can be avoided. Secondly, a
widely used pseudo-random problem generator, NETGEN (Klingman,
Napier, Stutz [1974]), is available for constructing feasible
network problems. Finally, as previously described, the recent
literature on network codes has been filled with controversy
as to whether the out-of-kilter or the primal simplex method
is the better approach to solving assignment problems.

3.     Notation  and  Methodology 

The  primary  aim  of  computational  com- 
parisons is  to  make  inferences  about  the 
relative  behavior  of  the  various  algor- 
ithms under  consideration.     It  is  widely 
recognized  that  such  comparisons  cannot 
lead  to  any  hard  conclusions  regarding  al- 
gorithms themselves;   rather,   one  can  only 
derive  information  on  the  relative  per- 
formance of  the  software  implementations  of 
these  algorithms.     What  is  not  realized  in 
most  cases  is  that  the  problem  of  comparing 
software  performance  is  a  statistical  one. 
Namely,   the  behavior  of  a  number  of  comput- 
er codes  on  a  specially  selected  set  of 
problems  is  measured  in  terms  of  certain 
performance  indicators  and  from  this  data 
inferences  are  made  about  the  behavior  of 
these  codes  on  a  larger  class  of  problems. 
This  is  clearly  a  case  of  statistical 
sampling  and  in  order  for  these  inferences 
to  have  any  firm  basis,   an  experimental  de- 
sign should  be  carefully  thought  out  with  a 
view  to  the  nature  of  the  inferences  that 
are to be made. Such a design must

(i)      identify  the  population  from  which 
sampling  is  to  take  place, 

(ii)     describe  the  statistical  sampling 
method,  and 
(iii)    state the hypotheses that are to
be tested.

If the above three factors are carefully developed, then once
the computer runs have been made and the appropriate variable
measured, inferences can be drawn as to the relative behavior
of these codes on the population of problems identified in
(i)*. Furthermore, the significance of these inferences can be
accurately measured using well-known statistical techniques.
In order to demonstrate the methodology, we perform, in
detail, a statistical comparison of the network codes KILTER
(Aashtiani [1976]) and LPNET (Mulvey [1976]).

In  order  to  describe  a  general  frame- 
work we  need  to  define  the  following  sets. 

P is the set of all problems that the mathematical algorithm
upon which the code is based is theoretically capable of
solving.

For example, P(KILTER) is the collection of all transshipment
problems solvable by the out-of-kilter algorithm and P(LPNET)
is the set of all transshipment problems solvable by network
specialized primal simplex algorithms.

P_C is the set of all problems for which the particular code
was designed.

For example, P_C(KILTER) is the set of all transshipment
problems with fewer than 2000 arcs and 500 nodes and
P_C(LPNET) is the set of all transshipment problems with
fewer than 2000 arcs and 500 nodes.

P_T is the population of test problems and is a subset of
$\bigcap_{i \in I} P_C(i)$, where I is the set of codes under
investigation.

In our case, P_T is taken to be the set of all feasible
assignment problems as generated by NETGEN (Klingman, Napier,
Stutz [1974]), with the following characteristics:

.  number of nodes between 200 and 500,

.  number of arcs between 1000 and 2000, and

.  range of cost coefficients between a lower bound of 1 and
   an upper bound greater than 100 but not exceeding 5000.
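
The membership conditions for the test population can be written down as a small predicate. The following is only an illustrative sketch: the names `ProblemSpec` and `in_test_population` are hypothetical, the bounds are treated as inclusive except where the text says "greater than", and the upper arc limit is taken to be 2000, as in Step 2 of the sampling procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemSpec:
    """Size parameters of one assignment problem (hypothetical container)."""
    nodes: int      # number of nodes (constraints)
    arcs: int       # number of arcs (variables)
    max_cost: int   # upper bound d of the cost range 1..d


def in_test_population(spec: ProblemSpec) -> bool:
    """Return True if the parameters fall inside the stated ranges.

    Feasibility of the generated problem itself is NOT checked here;
    in the paper that is guaranteed by the problem generator.
    """
    return (
        200 <= spec.nodes <= 500
        and 1000 <= spec.arcs <= 2000
        and 100 < spec.max_cost <= 5000
    )
```

For instance, the first problem of Table 2 (284 nodes, 1426 arcs, maximum cost 2176) satisfies the predicate, while a problem with 600 nodes does not.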


Here, P_T was chosen because of the physical limitations of
the computer used, our self-imposed restriction that the runs
should be carried out internally, and the desire to compare
these codes with respect to sparse, small-scale assignment
problems. Table 1 below gives the details of the storage
(high-speed memory) requirements and other important factors
that are required for reproducibility of our experiment.


Table 1
CODE SPECIFICATIONS

                    P(KILTER)            P(LPNET)

Internal Memory
Requirements        37K Words            16K Words
Language            Standard Fortran     Standard Fortran
Computer            DEC 1070 (HBS)³      DEC 1070 (HBS)
Compiler            Fortran-F40          Fortran-F40
Precision           Integer              Integer


A typical sampling procedure would be the following. Draw a
simple random sample, p, of n problems from the test problem
population P_T. The size of n is chosen to be large enough to
ensure, for example, that the distribution of sample means (of
the particular performance measure, e.g., run time) is
approximately normal. Methods for choosing n are described in
any elementary statistics text. A deciding factor in the
choice of n might be a limit on the acceptable probability of
rejecting a hypothesis that is actually true (Type I error).

In our case the simple random sample was chosen with the aid
of NETGEN, a feasible pseudo-random network problem generator
developed by Klingman, Napier and Stutz [1974]⁴. This choice
was primarily guided by a desire to make our experiment
easily reproducible; NETGEN serves this purpose since it may
be readily obtained from its authors and is already widely
used. Details of our simple random sampling procedure are
given below.


¹ All testing was performed at Harvard University on the DEC
1070 computer with a maximum of 64K words of high-speed memory.

² Without resorting to auxiliary memory.

³ Harvard Business School.

⁴ The authors are willing to perform identical experiments
with other generators provided the I/O formats conform to the
SHARE convention and the generating program can be
successfully compiled on the DEC 1070 computer (FORTRAN).

* The nature of the population and exactly what it represents
is not addressed here and is a subject for future research.
Procedure for Selecting a Simple Random
Sample of Test Problems from the
Population P_T


Step 1.  Generate a random integer, d, in the range 100 to
5000. This fixes the range of cost coefficients to the range
1 to d.

Step 2.  Generate a random integer, n_1, between 1000 and
2000. This fixes the number of arcs (variables) to n_1.

Step 3.  Generate a random integer, m, between 200 and 500.
This fixes the number of nodes (constraints).

Step 4.  Given the problem parameters specified in Steps 1, 2
and 3, use NETGEN to generate a single feasible assignment
problem with n_1 variables (Step 2) and m constraints (Step 3)
and cost coefficients randomly selected from a uniform
distribution in the range 1 to d.

Table 2

PROBLEM SPECIFICATIONS FOR FIRST FIFTY TEST CASES

                            Maximum               Random
                            Cost Co-              Number
OBS   # Nodes   # Arcs      efficient   Density   Seed

  1     284      1426         2176       .071      284
  2     422      1376         4394       .031      422
  3     466      1160         4930       .021      466
  4     488      1776         4086       .030      488
  5     234      1985         1041       .145      234
  6     430      1447         2859       .031      430
  7     474      1082         1136       .019      474
  8     434      1217          539       .026      434
  9     304      1628         2144       .070      304
 10     486      1080         2081       .018      486
 11     324      1669         4206       .064      324
 12     462      1395         2553       .026      462
 13     398      1353         2348       .034      398
 14     456      1160          150       .022      456
 15     268      1188         4421       .066      268
 16     416      1813          689       .042      416
 17     280      1579         3567       .081      280
 18     282      1663         3185       .084      282
 19     300      1775         1653       .079      300
 20     396      1755         3987       .045      396
 21     438      1111         4933       .023      438
 22     296      1504         4444       .069      296
 23     240      1722         2066       .120      240
 24     496      1609         1830       .026      496
 25     494      1897         2708       .031      494
 26     268      1534         4235       .085      268
 27     296      1870         3765       .085      296
 28     340      1927         2603       .067      340
 29     340      1731         3087       .060      340
 30     348      1348         4653       .045      348
 31     402      1880         4447       .047      402
 32     242      1187         2594       .081      242
 33     378      1672         1659       .047      378
 34     216      1706         4520       .146      216
 35     382      1497         2771       .041      382
 36     290      1309          233       .062      290
 37     494      1944         3865       .032      494
 38     460      1097         3592       .021      460
 39     482      1541         2837       .027      482
 40     232      1260         3473       .094      232
 41     454      1951         2522       .038      454
 42     224      1493         3985       .119      224
 43     386      1560          891       .042      386
 44     364      1448         4823       .044      364
 45     380      1853         4137       .051      380
 46     242      1471         4809       .100      424
 47     430      1836         1689       .040      430
 48     202      1466         4176       .144      202
 49     304      1157         1746       .050      304
 50     278      1240         1980       .064      278

Step 5.  Repeat Steps 1, 2, 3 and 4 until the desired sample
size has been reached.

We should note here that we assume that the number of arcs
and the number of nodes are uniformly distributed in the
ranges 1000 to 2000 and 200 to 500, respectively.
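
The parameter-drawing part of the procedure (Steps 1, 2, 3 and 5) can be sketched as follows. This is an illustration under stated assumptions, not the authors' program: the function names are hypothetical, Python's `random` module stands in for whatever pseudo-random source was actually used, and the NETGEN call of Step 4 is not reproduced. The `density` helper assumes that the Density column of Tables 2 and 3 is the number of arcs divided by (nodes/2) squared, which matches the first row of Table 2 but is not stated in the text.

```python
import random


def draw_problem_spec(rng: random.Random) -> dict:
    """Steps 1-3: draw the three size parameters uniformly at random."""
    d = rng.randint(100, 5000)     # Step 1: cost coefficients lie in 1..d
    n1 = rng.randint(1000, 2000)   # Step 2: number of arcs (variables)
    m = rng.randint(200, 500)      # Step 3: number of nodes (constraints)
    return {"max_cost": d, "arcs": n1, "nodes": m}


def draw_sample(size: int, seed: int = 0) -> list:
    """Step 5: repeat the draw until the desired sample size is reached.

    Step 4 (generating the actual problem) would be done by passing
    each parameter set to a problem generator such as NETGEN.
    """
    rng = random.Random(seed)
    return [draw_problem_spec(rng) for _ in range(size)]


def density(nodes: int, arcs: int) -> float:
    """Arc density of an assignment problem with nodes/2 sources and
    nodes/2 sinks (assumed definition, see the note above)."""
    half = nodes // 2
    return arcs / (half * half)
```

Calling `draw_sample(50)` yields fifty parameter sets, the size of each sample used in this study.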

In order to perform the statistical analysis in Section 4, we
selected two independent simple random samples, each
containing 50 problems. The problem sets and all the necessary
information required to reproduce them are given in Tables 2
and 3. The optimal solutions as well as the CPU run times
required by LPNET and KILTER are given in Tables 4 and 5.

In  the  following  section  we  perform 
a  statistical  analysis  of  the  results. 


Table 3

PROBLEM SPECIFICATIONS FOR SECOND FIFTY CASES

                            Maximum               Random
                            Cost Co-              Number
OBS   # Nodes   # Arcs      efficient   Density   Seed

 51     442      1716         4200       .035      442
 52     208      1611         2578       .149      208
 53     420      1293         1063       .029      420
 54     358      1428         4233       .045      358
 55     266      1965         3811       .111      266
 56     402      1673         1601       .041      402
 57     466      1033          666       .019      466
 58     402      1993          860       .049      402
 59     342      1063         2493       .036      342
 60     348      1598         2345       .053      348
 61     462      1966          792       .037      462
 62     362      1769         3152       .054      362
 63     478      1800         3412       .032      478
 64     216      1689         1176       .145      216
 65     352      1734         4280       .053      352
 66     294      1283         3546       .059      294
 67     258      1479         1279       .089      258
 68     234      1747         1244       .128      234
 69     356      1469         1999       .046      356
 70     292      1798         3520       .084      292
 71     330      1180         1450       .043      330
 72     274      1076         1769       .057      274
 73     422      1464         2241       .033      422
 74     452      1168         1704       .023      452
 75     326      1530         3003       .058      326
 76     236      1676         4062       .120      236
 77     450      1267         2074       .025      450
 78     356      1813         3906       .057      356
 79     246      1450          122       .096      246
 80     238      1225         2545       .087      238
 81     356      1102         3719       .035      356
 82     216      1205         1239       .103      216
 83     488      1536         3201       .026      488
 84     468      1863          128       .034      468
 85     244      1119         1751       .075      244
 86     454      1550         4512       .030      454
 87     308      1537         4549       .065      308
 88     486      1499         4375       .025      486
 89     442      1698         1476       .035      442
 90     232      1415         2596       .105      232
 91     250      1307         3163       .084      250
 92     228      1258         4336       .097      228
 93     372      1363         3278       .039      372
 94     256      1468          181       .090      256
 95     292      1508          734       .071      292
 96     496      1583         4176       .026      496
 97     420      1114         3781       .025      420
 98     398      1020         3388       .026      398
 99     382      1207         3049       .033      382
100     424      1657         4445       .037      424
Table 4

RUN TIME RESULTS FOR FIRST 50 PROBLEMS
(DEC 1070 seconds)

                              Optimal Objective
OBS    KILTER     LPNET       Function Value

  1    10.700     3.600            54393
  2    15.900     6.800           232241
  3    15.300     7.100           354043
  4    17.600     9.400           235820
  5    11.700     4.600            17057
  6    13.900     6.400           141765
  7     7.200     4.500            87847
  8    16.100     6.500            36739
  9    11.600     4.100            59286
 10    13.600     6.900           171024
 11    16.300     5.200           132492
 12    20.600     6.300           175529
 13     1.400     5.700           117919
 14     8.700     6.000            10228
 15     8.100     3.100           128061
 16    23.200     7.100            30992
 17    12.100     3.900            89323
 18    16.000     3.900            77548
 19    13.000     5.100            43025
 20    19.500     7.100           154219
 21    10.200     4.200           329131
 22    13.200     4.600           109068
 23    12.300     3.100            37046
 24    24.300     7.900           118056
 25    21.700     9.200           143626
 26     9.300     3.800            97023
 27    17.700     5.100            90408
 28    18.900     5.100            18911
 29    13.400     5.000            94035
 30    14.300     4.100           190304
 31    21.200     8.000           175347
 32     6.900     2.300            56866
 33    15.300     7.700            67607
 34    14.600     3.300            69186
 35    16.100     4.900           108440
 36     5.200     4.900             6944
 37    20.200     7.500           213927
 38    10.600     5.400           260746
 39    19.300     7.100           160290
 40     6.300     3.300            75499
 41    27.100     8.800           123985
 42    12.600     3.100            70750
 43    18.700     6.400            38006
 44    12.800     5.600           190868
 45    21.800     5.500           158790
 46    13.000     3.600           110732
 47    22.700     7.300            81686
 48     8.800     3.000            39916
 49    10.200     4.100            64907
 50    10.700     4.100            57996

Table 5

RUN TIME RESULTS FOR SECOND FIFTY PROBLEMS
(DEC 1070 seconds)

                      Optimal Objective
OBS    LPNET Time     Function Value

 51      6.950            218462
 52      3.940             37489
 53      6.500             61535
 54      4.260            168882
 55      3.690             72642
 56      7.890             67737
 57      5.700             55139
 58      7.250             33258
 59      4.090            118324
 60      6.060             86894
 61      7.800             41309
 62      6.470            103369
 63      8.220            188851
 64      4.170             17434
 65      5.990            144286
 66      3.390             99763
 67      3.400             32124
 68      4.000             23338
 69      5.810             83422
 70      5.040             85787
 71      4.600             66249
 72      3.470             57980
 73      7.470            109052
 74      4.670            115475
 75      5.950             97911
 76      4.260             75170
 77      6.520            130922
 78      7.000            131819
 79      3.600              2806
 80      3.810             56397
 81      4.100            172230
 82      2.960             23385
 83      8.600            208613
 84      8.680              6732
 85      3.320             46686
 86      8.710            253162
 87      4.310            139037
 88      7.860             30102
 89      8.250             80863
 90      3.500             52679
 91      3.100             72179
 92      3.640             89093
 93      5.590            148160
 94      3.310              4304
 95      5.110             19390
 96      7.200            263007
 97      5.400            226245
 98      4.570            202128
 99      5.650            153775
100      7.140            199224
4.     A  Statistical  Comparison  of  Assignment  Codes

Summary  statistics  for  the  sample 
sets  are  given  in  Tables  6  and  7. 


Table 6

SUMMARY STATISTICS FOR FIRST FIFTY PROBLEMS

            Min      Max      Mean       Std.Dev.

Nodes       202      496      360.04     89.98
Arcs        1080     1985     1526.96    267.5
Cost        150      4933     2944.36    1438
Density     .018     .146     .058       .0337

Table 7

SUMMARY STATISTICS FOR SECOND FIFTY PROBLEMS

            Min      Max      Mean       Std.Dev.

Nodes       208      496      350.2      88.56
Arcs        1080     1993     1479.3     264.7
Cost        122      4549     2584.06    1330
Density     .019     .149     .059       .0337

Notice the relatively large standard deviations for the
number of nodes and arcs, and the cost range; this is due to
the use of a uniform density function in generating problems.
Since the problems within p are "small-scale" examples, we
wanted the characteristics of the problems within p to be
selected with equal likelihood. For this reason a uniform
density function was employed.

Turning to the computational results displayed in Table 4, it
is interesting to observe that, although KILTER is fully 2.7
times slower than LPNET on the average, there is some
variability. Specifically, KILTER is only 1.45 times slower
than LPNET for problem #14, and a similar occurrence is
demonstrated for problem #36.
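
The slowdown factors quoted above are simple ratios of run times, which can be checked with a few lines of code. This is only an illustrative sketch: the function name `slowdown` is hypothetical, the overall figure uses the mean run times reported in Table 9, and the per-problem figures use the Table 4 entries for problems 14 and 36.

```python
def slowdown(kilter_seconds: float, lpnet_seconds: float) -> float:
    """How many times longer KILTER ran than LPNET."""
    return kilter_seconds / lpnet_seconds


# Mean run times over the first fifty problems (Table 9).
overall = slowdown(14.698, 5.426)

# Individual problems from Table 4 where the gap is much smaller.
problem_14 = slowdown(8.7, 6.0)
problem_36 = slowdown(5.2, 4.9)

print(f"overall: {overall:.2f}, #14: {problem_14:.2f}, #36: {problem_36:.2f}")
```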


Table 8

SAMPLE CORRELATION COEFFICIENTS FOR PARAMETERS
(PROBLEMS 1-50)

                   Nodes    Arcs     Cost     Density   Kilter    LPNET

Nodes              1       -.0330   -.1516    -.9022    .9400     .1333
Arcs                        1        .0375     .2697    .6713     .3621
Cost                                 1         .1571    .0409    -.1472
Density                                        1       -.3377    -.?6?0
Run-Time-Kilter                                         1         .7286
Run-Time-LPNET                                                    1

It is interesting to observe that the run times of LPNET and
KILTER are moderately correlated with each other, with a
positive correlation coefficient of .7286, and that their
relative dispersions (standard deviation/mean) are
approximately equal (see Table 9).
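
The relative dispersion referred to here is the ratio s/x̄ of the sample standard deviation to the sample mean. As a minimal sketch (the function name is hypothetical), the three ratios listed in Table 9 can be recomputed from the means and standard deviations given there:

```python
def relative_dispersion(std_dev: float, mean: float) -> float:
    """Sample standard deviation divided by sample mean (s / x-bar)."""
    return std_dev / mean


kilter_first_50 = relative_dispersion(5.223, 14.698)
lpnet_first_50 = relative_dispersion(1.750, 5.426)
lpnet_second_50 = relative_dispersion(1.743, 5.459)

print(round(kilter_first_50, 3), round(lpnet_first_50, 3),
      round(lpnet_second_50, 3))
```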

The next set of experiments tested LPNET on problems 51
through 100. Notice in Table 9 that the summary statistics,
that is, the mean run time, standard deviation and s/x̄ ratio
for the experiment, are approximately¹ equal to the summary
statistics for the first 50 problems. Since this set is
independent of


Table 9
RUN TIME SUMMARY STATISTICS

                  KILTER             LPNET

First 50          x̄   = 14.698      x̄   = 5.426
Problems          s   = 5.223       s   = 1.750
                  s/x̄ = .355        s/x̄ = 0.323

Second 50                           x̄   = 5.459
Problems                            s   = 1.743
                                    s/x̄ = 0.319

the first 50 problems, we are able to examine the
distribution of sample means of size 50 for LPNET run time.
Because of the relatively large sample size, 50, the
distribution of sample means is approximately normal,
regardless of the nature of the distribution of individual
run times. The sample mean is 5.459, which is a point estimate
of the population mean run time. The estimated standard error
of the mean is 0.249². The small standard error is a result of
the relatively large number of observations in the sample p.
In a similar fashion, a point estimate of the mean run time
for KILTER is 14.7 and an estimate of the standard error of
the mean is 0.75.

The  distribution  of  sample  means  may 
also  be  used  to  generate  confidence  inter- 
vals on  the  mean  CPU  execution  times  for 


¹ There are many useful tests that may be executed with data
obtained from two independent simple random samples. For
example, inferences made using the first sample may be checked
using hypothesis testing based on statistics measured on the
second sample.

² Standard error of the mean: $\sigma_{\bar{x}} = s/\sqrt{n-1}$.
the population of problems, P_T, for both KILTER and LPNET.
If we denote by $\mu_K$ and $\mu_L$ the mean CPU time in
seconds for KILTER and LPNET for the population P_T, we can
construct the following 95% confidence intervals:

$$13.238 \le \mu_K \le 16.162$$

$$4.936 \le \mu_L \le 5.916$$

$$7.704 \le \mu_K - \mu_L \le 10.798$$

Since  CPU  times  can  be  influenced 
by  the  number  and  type  of  other  programs 
which  were  operating  during  execution,  we 
carried  out  the  following  experiment  to 
measure  the  extent  of  the  variability 
which  can  occur  in  the  DEC-1070  environ- 
ment at  Harvard.     A  single  test  problem 
(456  nodes,   1512  arcs,   cost  range 
[1,1414],   seed  228)   was  solved  by  LPNET 
at  seven  randomly  selected  times  during  a 
one-week period. The resulting CPU times
are:


(" fully occupied)

1)   8.400 seconds
2)   8.331 seconds
3)   8.933 seconds
4)   8.210 seconds
5)   7.966 seconds
6)   7.919 seconds
7)   8.575 seconds

(* empty)


Thus  from  the  above  data  we  can  expect 
the  maximum  error  in  measured  CPU  time 
to  be  of  the  order  of  13%.     It  should 
be  noted  that  these  variations  are 
caused  by  automatic  "swapping"  of  jobs 
between  central  memory  and  auxiliary 
storage.     This  phenomenon  occurs  when- 
ever a  computer  installation  performs 
in  a  multiprogramming  environment. 
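
The 13% figure is consistent with the spread between the slowest and fastest of the seven runs. A minimal sketch, assuming the variation is expressed relative to the fastest run (the text does not say which baseline was used):

```python
# CPU times in seconds for the seven runs of the same test problem.
cpu_times = [8.400, 8.331, 8.933, 8.210, 7.966, 7.919, 8.575]

fastest = min(cpu_times)
slowest = max(cpu_times)

# Relative spread, taking the fastest run as the baseline (an assumption).
relative_spread = (slowest - fastest) / fastest

print(f"max variation: {100 * relative_spread:.1f}%")
```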

A linear regression analysis based on the first sample of 50
problems was undertaken to determine the overall combined
effects of problem specifications on performance, as measured
by CPU time. Several regression models were tried. The most
appropriate linear regression equation for LPNET was found to
be:

$$Y_{\mathrm{LPNET}} = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$

where

$Y_{\mathrm{LPNET}}$ = run time for LPNET in seconds
$x_1$ = number of nodes
$x_2$ = number of arcs

and the estimated coefficients are given by

$\beta_0 = -4.393$ with a standard deviation of .72

$\beta_1 = 0.01645$ with a standard deviation of .00120

$\beta_2 = 0.00255$ with a standard deviation of .00038

The estimated standard deviation of the residual for the
above regression equation was .72 and the coefficient of
determination (sample $R^2$) was computed to be 0.845. The
relatively* small standard deviations of the sample regression
coefficients $\beta_0$, $\beta_1$ and $\beta_2$ indicate that
the sample estimates $\beta_0$, $\beta_1$ and $\beta_2$ are
significant. Another way of saying this would be that the CPU
run time of LPNET depends to a significant extent on the
number of nodes and number of arcs in the problem being
solved. The high $R^2$ (.845) means that 84.5% of the variance
can be explained by the size of the problem as measured by the
number of nodes and arcs.
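
The significance claim rests on the ratio of each estimated coefficient to its standard deviation. A short sketch of that check follows; the names are hypothetical, the node coefficient is taken as 0.01645 (the value used in the later discussion of the two models), and the threshold of 3 is the rule of thumb stated in the accompanying footnote.

```python
# (estimate, standard deviation) for the LPNET regression coefficients.
lpnet_coefficients = {
    "beta_0": (-4.393, 0.72),
    "beta_1": (0.01645, 0.00120),
    "beta_2": (0.00255, 0.00038),
}


def significance_ratio(estimate: float, std_dev: float) -> float:
    """Absolute value of the estimate divided by its standard deviation."""
    return abs(estimate) / std_dev


ratios = {name: significance_ratio(est, sd)
          for name, (est, sd) in lpnet_coefficients.items()}

for name, ratio in ratios.items():
    # Ratios well above 3 are read as "significant" in the text.
    print(f"{name}: {ratio:.1f}")
```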

Although the relationships are approximately linear over this
collection of problems, we do not expect that a
strictly  linear  extrapolation  can  be  made 
to  larger  problems.     Caution  should  be 
taken  not  to  use  this  regression  equation 
to  make  predictions  for  populations  other 
than  the  one  considered  in  this  study. 

A second regression model fits KILTER's run time to the
problem specifications with the resulting coefficients:

$$Y_{\mathrm{KILTER}} = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$

where

$Y_{\mathrm{KILTER}}$ = run time for KILTER in seconds
$x_1$ = number of nodes
$x_2$ = number of arcs

and the estimated coefficients are given by

$\beta_0 = -17.63$ with a standard deviation of 2.63

$\beta_1 = 0.03267$ with a standard deviation of .00409

$\beta_2 = 0.01347$ with a standard deviation of .001376

The estimated standard deviation of the residual for the
above regression equation was 2.601 and the coefficient of
determination ($R^2$) was computed to be .767. This

* The quantities of interest are actually:

$|\beta_0|/\sigma_0 = 4.393/.72 = 6.1$

$\beta_1/\sigma_1 = .01645/0.0012 = 13.7$

$\beta_2/\sigma_2 = 0.00255/0.00038 = 6.7$

Since these numbers are much larger than 3 and the $\beta_j$
are approximately normally distributed we conclude that
$\beta_0$, $\beta_1$ and $\beta_2$ are all significantly
greater than zero.
model is not as predictive as the previous, but $R^2$ remains
relatively high; here, 76.7% of the variance can be explained
by the number of nodes and arcs.

When an independent variable representing the cost range was
added to this model the sample $R^2$ increased slightly to
77.7%. This insignificant improvement did not warrant
inclusion of a third independent variable. Hence the simpler
two variable model was selected.¹

From these two regression models, we see that LPNET seems to
depend to a greater degree on the number of nodes than the
number of arcs in the network ($\beta_1 = .01645$,
$\beta_2 = .00255$), whereas KILTER depends upon the number of
nodes to a much lesser degree ($\beta_1 = .03267$,
$\beta_2 = .01347$). Loosely speaking, this means that LPNET
is more node-dependent than KILTER, while KILTER is more
arc-dependent than LPNET. This result is intuitively appealing
since the primal-simplex algorithm LPNET works with a basis
spanning tree (m nodes) which must be maintained -- thus the
dependence on the number of nodes. The out-of-kilter
algorithm, on the other hand, works primarily with arcs which
are "out-of-kilter" -- thus the greater dependency on the
number of arcs.

Besides being used for understanding the interdependence of
problem characteristics and program performance, these
regression analyses can be employed for forecasting CPU times
for problems in the population P_T. This can be done either by
making point estimates, substituting for $x_1$ and $x_2$ in
the regression equations, or by setting up confidence
intervals for the predicted run times $Y_{\mathrm{LPNET}}$ and
$Y_{\mathrm{KILTER}}$.

For  example,   a  point  estimate  of 
the  LPNET  run  time  for  a  problem  with 
456  nodes  and  1160  arcs    (see  problem  14) 
would  be: 

$Y_{\mathrm{LPNET}} = 6.058$ seconds

The  actual  run  time  for  problem  14  was 
6.000  seconds.     Here  the  error  in  the 
point  estimate  is  approximately  1%. 
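
A point estimate of this kind is obtained by substituting the node and arc counts into the fitted equation. The sketch below uses the rounded coefficients printed in the text (with the node coefficient taken as 0.01645) and a hypothetical function name; with these rounded values the result comes out near 6.07 seconds rather than exactly 6.058, presumably because the published figure was computed from unrounded coefficients.

```python
def predict_lpnet_seconds(nodes: int, arcs: int) -> float:
    """Evaluate the fitted LPNET regression equation.

    Coefficients are the rounded values printed in the text, so the
    result may differ slightly from the published point estimate.
    """
    beta_0, beta_1, beta_2 = -4.393, 0.01645, 0.00255
    return beta_0 + beta_1 * nodes + beta_2 * arcs


estimate = predict_lpnet_seconds(456, 1160)   # problem 14
actual = 6.000                                # Table 4, LPNET, problem 14
relative_error = abs(estimate - actual) / actual

print(f"estimate {estimate:.3f} s, relative error {100 * relative_error:.1f}%")
```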

A 98% confidence interval for $Y_{\mathrm{LPNET}}$ would be:

$$\hat{Y}_{\mathrm{LPNET}} - z_{.98/2}\,\sigma_y \le Y_{\mathrm{LPNET}} \le \hat{Y}_{\mathrm{LPNET}} + z_{.98/2}\,\sigma_y$$

where $\hat{Y}_{\mathrm{LPNET}}$ is the point estimate run
time, $\sigma_y$ is the estimated residual standard deviation
of Y, and $z_{.98/2}$ is the standard normal deviate.

¹ In a similar analysis carried out on larger problems, the
cost coefficient range became more important.

Hence, in our case with 98% certainty

$$6.058 - 2.303(.72) \le Y_{\mathrm{LPNET}} \le 6.058 + 2.303(.72),$$ or

$$4.400 \le Y_{\mathrm{LPNET}} \le 7.716$$

Thus we predict with 98% certainty that, for problems with
456 nodes and 1160 arcs and any cost coefficients falling in
the range defined by our population, LPNET will take no longer
than 7.716 seconds and no less than 4.400 seconds to find an
optimal solution.
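
The interval arithmetic can be reproduced directly from the point estimate, the residual standard deviation and the deviate used in the text. The sketch below is illustrative and the function name is hypothetical; it takes the deviate 2.303 as given in the text. For comparison, the two-sided 98% standard normal deviate computed with Python's `statistics.NormalDist` is approximately 2.326, which would give a marginally wider interval.

```python
from statistics import NormalDist


def prediction_interval(point: float, resid_sd: float, deviate: float):
    """Interval point +/- deviate * residual standard deviation."""
    half_width = deviate * resid_sd
    return point - half_width, point + half_width


# Values as used in the text: point estimate, residual s.d., deviate.
low, high = prediction_interval(6.058, 0.72, 2.303)

# Two-sided 98% standard normal deviate, for comparison only.
z_98 = NormalDist().inv_cdf(0.99)

print(f"[{low:.3f}, {high:.3f}] using 2.303; library deviate {z_98:.3f}")
```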

Clearly,  many  other  statistical 
tests  could  be  carried  out,  depending 
upon  what  the  specific  objectives  of  the 
experiments  were.     Since  our  aim  is  to 
introduce  a  methodology  rather  than  ex- 
haustively compare  KILTER  and  LPNET,  we 
did  not  conduct  further  testing. 

5.  Conclusions 

The statistical analysis in Section 4 shows that the code
LPNET is clearly superior to the code KILTER for problems
within the collection P_T. The conclusions are not unexpected
since LPNET dominates KILTER. Nonetheless, we are able to
measure precisely the extent of the superiority and propose
confidence limits within a statistical framework. Despite this
analysis, there are many unanswered questions about LPNET's
superiority on problems outside of P_T but within collections
P_C and P. It is an open question whether similar conclusions
would be obtained if the objective of the comparative study
were to evaluate how well the codes reoptimize, that is, how
well the codes restart from a nearby basic feasible solution.
Thus we caution the reader to treat these results as
conclusive over set P_T, but not outside of this domain.

The primary purpose of this report was to develop an initial
framework for analyzing and comparing mathematical
programming software. Obviously there is
much  unfinished  work  to  be  accomplished. 
A  thorough  study  of  problem  generators 
as  relating  to  real-world  examples  is  an 
important  next  step.     If  the  complexity 
of  real-world  problems  could  be  better 
understood,   it  might  be  possible  to  design 
problem  generators  which  more  closely  re- 
flect the  characteristics  of  realistic 
problems.     The  following  idea  which  pro- 
poses a  synthesis  of  the  two  categories 
of  test  problems  described  in  Section  2 
might  be  useful  for  building  generators. 

Start by choosing a representative
test case t from the class of problems
under study, for instance, a particular
nonlinear programming design problem.
Place perturbations on various parameters,
such as the cost coefficients, and
call the resulting population Pt. (It
is left up to the researcher to select the
appropriate parameters and ranges.)
From this population a sample p is drawn
and inferences about Pt are made. Since
the population Pt is defined to be within
an ε-neighborhood of the original
test case, conclusions about Pt, in some
sense, reflect a sensitivity analysis
for problem t. For instance, the standard
deviation of run time measures how
sensitive the performance of the code is
to small changes in the original data.
As a secondary consideration, it might
be interesting to execute the programs
from different starting points.
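
The perturbation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the choice of a relative perturbation of each cost by at most eps, and the use of an insertion-sort shift count as a stand-in for run time are all hypothetical and do not come from the paper. A real study would time the actual code on each perturbed problem.

```python
import random
import statistics


def perturb(base_costs, eps, rng):
    """Scale every cost by a random factor in [1 - eps, 1 + eps]."""
    return [c * (1.0 + rng.uniform(-eps, eps)) for c in base_costs]


def draw_sample(base_costs, eps, size, seed=0):
    """Draw `size` perturbed problems from the neighborhood of the base case."""
    rng = random.Random(seed)
    return [perturb(base_costs, eps, rng) for _ in range(size)]


def insertion_sort_work(costs):
    """Toy stand-in for a solver: count the element shifts of an insertion sort."""
    a, shifts = list(costs), 0
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
            shifts += 1
        a[j + 1] = key
    return shifts


def sensitivity(base_costs, eps, size, measure, seed=0):
    """Mean and sample standard deviation of `measure` over the drawn sample."""
    values = [measure(p) for p in draw_sample(base_costs, eps, size, seed)]
    return statistics.mean(values), statistics.stdev(values)
```

A call such as `sensitivity([4.0, 4.1, 3.9, 4.05], 0.05, 30, insertion_sort_work)` returns the mean and standard deviation of the work measure over the sample; a large standard deviation relative to the mean would suggest that performance is sensitive to small changes in the data.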

These concepts fall within the
purview of experimental design. Since it
is important to minimize the computer
costs for testing, especially for
large-scale examples, a sound experimental
design can reduce the variances of the
estimates and thereby result in lower
computational costs. As an example, the
simple random sampling procedure which
was performed in this paper can be
replaced by stratified sampling.
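
As a rough illustration of why stratification can lower the variance of an estimate, the sketch below compares a simple random sample mean with a proportionally allocated stratified mean. The strata, the function names, and the proportional allocation rule are assumptions made for this example; the paper does not specify how strata would be formed (problem size or cost range would be natural candidates).

```python
import random
import statistics


def simple_random_estimate(population, n, rng):
    """Mean of a simple random sample of size n."""
    return statistics.mean(rng.sample(population, n))


def stratified_estimate(strata, n, rng):
    """Weighted mean of per-stratum sample means (proportional allocation)."""
    total = sum(len(s) for s in strata)
    estimate = 0.0
    for s in strata:
        n_h = min(len(s), max(1, round(n * len(s) / total)))
        estimate += (len(s) / total) * statistics.mean(rng.sample(s, n_h))
    return estimate


def compare_variances(strata, n, replications, seed=0):
    """Variance of each estimator over repeated sampling from the same population."""
    rng = random.Random(seed)
    population = [x for s in strata for x in s]
    srs = [simple_random_estimate(population, n, rng) for _ in range(replications)]
    strat = [stratified_estimate(strata, n, rng) for _ in range(replications)]
    return statistics.pvariance(srs), statistics.pvariance(strat)
```

When run times differ mostly between strata rather than within them, the stratified estimator varies much less from sample to sample, so fewer test runs are needed for the same precision.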

Another important area for future
research lies in developing a clearer
view of the relationships between the
sets Pt, Pc and P. An estimate on the
upper and lower bounds of CPU time for
problems within Pt might lead to an
accurate estimate for average CPU time
for problems within Pc. The same might
be said for P as well.

The ultimate benefit of the
statistical framework is for assessing
competing software codes or systems so
that a systematic choice can be made as
to the best technique for a particular
user. Clearly this decision is a
difficult multi-attributed problem and until
these types of decisions can be handled,
we must be content to list a profile of
characteristics and performance measures
for each code. From this profile, the
user can select the appropriate
technique for his or her particular needs.

Finally, we should mention that the
Working Committee on Algorithms (WCA) of
the Mathematical Programming Society is
actively engaged in computation-related
research. (The authors are members of
the committee.) The ambitious goals of
the WCA are to:

1)  collect a graded set of test
problems,

2)  act as a focal point for knowledge
of computer programs that are
available for the same calculation,

3)  recommend "best buys" where
several techniques are available
for the same calculations,

4)  encourage persons who distribute
programs to meet certain standards
of portability, testing, ease of
use and documentation, and

5)  define guidelines for conducting
and reporting computational
experiences.

One of the most important, and most
difficult, goals is the last: to define
journalistic guidelines. The imposition
of fair guidelines could markedly improve
the state of computational work by
providing better information to researchers
and software users.
Abstract

We discuss different approaches to evaluating
optimization routines and describe a particular
method which uses parameterized test problems. We
illustrate this approach through a simple case
study of three well known unconstrained
optimization routines applied to three parameterized test
problems. In particular we display our results as
a set of graphs.

1.  INTRODUCTION 

Evaluating mathematical routines is a
difficult task, and one that requires both qualitative
and quantitative measures of performance. A
fundamental requirement is that the testing
environment simulate an actual environment of use since,
if it did not, the evaluation would be valid, but
in all likelihood, irrelevant. Furthermore, the
overall quality of a code can only be gauged after
investigating a broad range of issues, for example,
efficiency, robustness, usability, usefulness of
documentation, ability to fail gracefully in the
presence of user abuse, rounding error
difficulties or violation of underlying assumptions. A
testing method usually concentrates on efficiency
and robustness, evaluating these by exercising
the code on a set of well chosen and hopefully
realistic problems.

Our aim in this paper is to discuss some of
the issues pertinent to the evaluation of
optimization routines, and to describe an approach
which employs parameterized test problems, first
introduced for the purpose of evaluating routines
for numerical quadrature by Lyness and Kaganove
[1]. We describe in Sections 2 and 3 our
particular motivation for and usage of such functions.
To illustrate this approach three well known
optimization routines were evaluated in a simple case
study. In Sections 4 and 5 we describe the routines
and test problems, and present the results in
graphical form. We believe that this case study
gives some interesting examples of how
unconstrained optimization routines behave when applied
to parameterized test problems. The study is,
however, very limited in scope, and we discuss
some of these limitations in the concluding
section.

* Work performed under the auspices of the U.S. Energy
Research and Development Administration.


2.      DIFFERENT  APPROACHES  TO  TESTING 

To date the most common method of evaluating
optimization routines has become known as 'battery'
or 'simulation' testing. The first really
comprehensive study along these lines is that of
Hillstrom [2]. Battery testing has two basic
components, namely, a set of test problems given by
an objective function and a starting vector, and a
number of measures of performance.

Test functions are chosen from the literature,
or from real life applications, usually because
they have some prominent feature, such as a curving
valley possibly helical, a singular Hessian at the
minimum, badly scaled variables or large
dimensionality. The choice of starting points is, of
course, crucial to performance. Many routines
perform well from the standard starting points.
However, as discussed in [2], the use of a number of
widely dispersed starting points reveals much
about the strengths and weaknesses of a code, for
example, how robust it is, the suitability of its
convergence criteria, or the code's ability to
handle long searches through non-quadratic regions.

A variety of different measures of performance
have been used. The most common measure is the
number of calls to the routine which develops
information about the function, or the number of
equivalent function evaluations; since the cost of
gradient information is hardly ever the equivalent
of n function evaluations (where n is the problem
dimension), the appropriate weighting of gradients
leads to the DCU or Homer unit scheme of
Hillstrom [2]. Other measures include overhead
and average rate of convergence.

Whilst battery testing yields very valuable
information about the behaviour of optimization
routines, it is nevertheless subject to
limitations which often make a clear ranking of methods
difficult to discern. First, it is difficult to
know how much confidence should be attached to the
measures of performance, since slight variation of
starting point or geometry of test problem can
lead to substantial variation in a performance
measure. Second, when starting points are varied
substantially, one can, in effect, get very
different test problems. For example, Rosenbrock's
function $f(x) = 100(x_1^2 - x_2)^2 + (1 - x_1)^2$ is
more difficult to solve if started at the point
(-2,2) than if started at (2,2); in the former
case, a routine must follow a steep curving
valley, particularly taxing at the point (0,0).
Thus, when evaluating a routine's ability to cope
with a particular feature it may not be advisable
to average a performance measure over widely
differing starting points; however, if no averaging is
done, one may be swamped with numbers.

In attempting to deal with some of these
difficulties, we have utilized parameterized test
problems, introduced originally into the evaluation
of routines for numerical quadrature by Lyness and
Kaganove [1]. We will not, however, use the
Lyness-Kaganove term performance profile testing,
which involves a somewhat specific usage of
parameterized test problems to rank "black-box"
mathematical routines. Though we also want to
compare routines, our usage of parameterized test
problems is oriented toward the gathering of
information on the performance of an algorithm, with a
view to further development of its implementation.
Thus our bias is toward the development of tools
to aid the process of algorithm development rather
than tools to aid mathematical software evaluation;
though we are also concerned with the latter it is,
in our opinion, a much harder problem. In the
next section we discuss our method.

3.      OUR  APPROACH 

Test problems are chosen in which a prominent
feature can be varied by changing the value of a
parameter λ occurring in the problem's mathematical
formulation. We have departed from the
Lyness-Kaganove approach, which is to parameterize a
problem in such a way as to provide many different
'incarnations' of a test problem, all of
approximately the same level of difficulty, by
parameterizing e.g. the position of the peak of a unimodal
function but not its shape. Varying our parameter
alters significantly the overall difficulty
of the problem. Thus one can test a routine's
effectiveness with respect to a particular
feature and study the routine's performance as this
feature becomes more and more prominent. For
example, the feature may be a steep curving valley,
whose steepness or curvature increases with λ. At
the same time measures of performance for values
of λ in the neighborhood of any particular value,
say $\bar{\lambda}$, will give an idea of the variability of the
performance measures.

Starting points are, of course, critical to
the minimization process and often bias the path
of search. In addition to running each test using
a conventional or fixed starting point, the
initial points were varied using a random number
generator. The motivation for this approach is
twofold: 1) it was desired to obtain the "spread"
between the maximum and minimum number of calls to
FCN and thus determine how sensitive each routine
is to a change in the starting point; and 2) by
obtaining average values and using these for
comparison purposes, it was hoped that the results
would be less biased with respect to the starting
points.

Clearly, an important consideration is how
the points should be varied. We chose to vary
starting points within a "box" surrounding the
conventional starting point thus producing
different initial points but the same general
starting location with respect to the topography.
A hypercube of dimension 0.1 units was used for all
the tests in which the starting points were varied.
Another important aspect is that the same starting
points were used for each run (i.e., for each
value of λ) by resetting the random number
generator.

The statistics gathered for each test function
include the following:

1.  A table for each optimization routine
using conventional starting points. The
number of FCN calls, the number of
iterations, the solution point, gradient, and
function value for each λ are given.

2.  A table for each optimization routine
using the random starting points. The
average, maximum and minimum number of FCN
calls, and the average number of
iterations for each λ are given.

3.  Plots of the maximum, minimum, and average
number of FCN calls versus λ for each
routine used.

4.  A superimposed graph of three plots
(corresponding to the three routines) of
average number of FCN calls versus λ.

The  graphs  were  plotted  using  Fortran  subroutines. 
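
The procedure of this section can be sketched as a small Python harness. Everything named here is an assumption made for illustration: `toy_valley` is a hypothetical parameterized problem, `steepest_descent` is a deliberately simple stand-in minimizer (not CGMIN, BFGS, or OCOPTR), and the box half-width of 0.1 is inferred from the starting-point intervals quoted in Section 4.2. The sketch only shows the bookkeeping: count FCN calls, draw the same random starting points for every value of the parameter by re-seeding the generator, and report minimum, average, and maximum calls.

```python
import random


class CountedFcn:
    """Wrap a function-and-gradient routine and count calls, like the paper's FCN."""

    def __init__(self, fcn, lam):
        self.fcn, self.lam, self.calls = fcn, lam, 0

    def __call__(self, x):
        self.calls += 1
        return self.fcn(x, self.lam)


def toy_valley(x, lam):
    """Hypothetical test problem F = x1^2 + lam * x2^2 and its gradient."""
    return x[0] ** 2 + lam * x[1] ** 2, [2.0 * x[0], 2.0 * lam * x[1]]


def steepest_descent(fcn, x0, eps=1e-6, max_calls=500):
    """Stand-in minimizer: steepest descent with step halving (Armijo test)."""
    x = list(x0)
    f, g = fcn(x)
    while fcn.calls < max_calls:
        gg = sum(gi * gi for gi in g)
        if gg ** 0.5 < eps:
            break
        step = 1.0
        while fcn.calls < max_calls:
            trial = [xi - step * gi for xi, gi in zip(x, g)]
            f_new, g_new = fcn(trial)
            if f_new < f - 1e-4 * step * gg:
                x, f, g = trial, f_new, g_new
                break
            step *= 0.5
    return x


def random_starts(center, half_width, count, seed):
    """Re-seeding reproduces the same starting points for every parameter value."""
    rng = random.Random(seed)
    return [[c + rng.uniform(-half_width, half_width) for c in center]
            for _ in range(count)]


def sweep(fcn, minimizer, lambdas, center, half_width=0.1, count=10,
          seed=1, max_calls=500):
    """Return (lam, min calls, average calls, max calls) for each lam."""
    rows = []
    for lam in lambdas:
        counts = []
        for x0 in random_starts(center, half_width, count, seed):
            counted = CountedFcn(fcn, lam)
            minimizer(counted, x0, max_calls=max_calls)
            counts.append(counted.calls)
        rows.append((lam, min(counts), sum(counts) / len(counts), max(counts)))
    return rows
```

Plotting the three columns of `sweep(...)` against the parameter gives the kind of maximum, minimum, and average curves listed above; the superimposed graph is obtained by running the sweep once per routine and overlaying the averages.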

4.      A  CASE  STUDY 

In order to illustrate these ideas, a simple
case study was carried out involving three routines
and three parameterized test problems.

4.1    Routines  Used 

The following three unconstrained nonlinear
optimization routines were used in this study.

1.  CGMIN - a modularized version of the
function minimization Harwell Library
routine VA08A written by R. Fletcher
[3] which uses the Fletcher-Reeves
version of the conjugate gradient
technique (see [7]).

2.  BFGS - a modularized version of the
Davidon-Fletcher-Powell quasi-Newton
function minimization algorithm [4,5]
with a BFGS (Broyden-Fletcher-
Goldfarb-Shanno) [6] update to the
Hessian (see [7]).

3.  OCOPTR - a modularized implementation of
C. Davidon's [8] optimally conditioned
algorithm for calculating the minimum
of a function of several variables.

A user-supplied subroutine FCN which evaluates the
function value F and the components of the
gradient G at the point X is needed by each of the
above routines.

In addition, the user must supply the
following information in the calling program to CGMIN:

1.  the number of variables (i.e. the
dimension of X).

2.  the initial estimate of the solution
vector X.

3.  an estimate RFN of the expected reduction
in F (used on the first iteration only).

4.  the accuracy required in the solution, ε1.
The first condition for a normal return
is that the differences between the
components of two successive estimates of
the solution are less than ε1.

5.  an additional accuracy requirement, ε2.
The second condition for a normal return
is that the gradient norm is less than ε2.

6.  the limit on the number of calls to the
function evaluation subroutine FCN. If
the limit is exceeded before a minimum is
attained, CGMIN will terminate abnormally
with an appropriate error code.

For all of the runs, RFN was set equal to 1.0; ε1
and ε2 were set equal to 10~^. For all three of
the routines, the limit on the number of FCN calls
was set at 200, 500, or 1000 depending on the test
function.

The routine BFGS requires the following
information from the calling program:

1. & 2.  (same as for CGMIN).

3.  the accuracy required in the solution, ε.
A normal return occurs if the sum of the
absolute values of the components of the
solution difference and direction vectors
are both not greater than ε.

4.  an estimate of the minimum value of F(X).

5.  the limit on the number of calls to FCN.

In all cases, ε was set to 10"^ and the estimate of
the minimum was set equal to 0.

The necessary parameters in the calling
program to the optimization routine OCOPTR are:

1. & 2.  (same as for CGMIN).

3.  the limit on the number of FCN calls.

4.  the accuracy sought, ε. A normal return
occurs if $g^T H g < \varepsilon$, where H is the current
approximation to the inverse Hessian, and
g is the gradient at the current iterate.

5.  an estimate of the lower bound of the
function.

ε was set to 10"^, and the
lower bound estimate was set equal to 0.
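
The three stopping rules described above are different tests, which is easy to see when they are written side by side. The sketch below is an interpretation of the prose, not the routines' actual code: the function names are hypothetical, the Euclidean norm is assumed for the CGMIN gradient test, and both CGMIN conditions are assumed to be required together.

```python
import math


def cgmin_converged(x_old, x_new, grad, eps1, eps2):
    """Every component change below eps1 and the gradient norm below eps2."""
    step_ok = all(abs(a - b) < eps1 for a, b in zip(x_new, x_old))
    grad_ok = math.sqrt(sum(g * g for g in grad)) < eps2
    return step_ok and grad_ok


def bfgs_converged(x_old, x_new, direction, eps):
    """Sums of absolute components of the step and of the direction both <= eps."""
    step_sum = sum(abs(a - b) for a, b in zip(x_new, x_old))
    dir_sum = sum(abs(d) for d in direction)
    return step_sum <= eps and dir_sum <= eps


def ocoptr_converged(grad, h_inv, eps):
    """g^T H g < eps, with H the current inverse-Hessian approximation."""
    hg = [sum(h_ij * g_j for h_ij, g_j in zip(row, grad)) for row in h_inv]
    return sum(g_i * hg_i for g_i, hg_i in zip(grad, hg)) < eps
```

With the same tolerance, an iterate can satisfy one rule and fail another. For a gradient of (1e-3, 0) and H equal to 1e-3 times the identity, the third test passes at a tolerance of 1e-8 while the gradient-norm test does not. The tolerance used in this illustration is a placeholder, not the value used in the study.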

4.2    Problems  Selected 

The following test functions were parameterized
and implemented in the evaluation of the above
algorithms ($\lambda_0$ represents the conventional or original
value of the parameter):

1.  Parameterized Rosenbrock test function ($\lambda_0 = 100$)

a.  $F = (1 - X_1)^2 + \lambda (X_2 - X_1^2)^2$

b.  $\partial F / \partial X_1 = -2(1 - X_1) - 4 \lambda X_1 (X_2 - X_1^2)$
    $\partial F / \partial X_2 = 2 \lambda (X_2 - X_1^2)$

c.  conventional starting point is (-1.2,1.0).

The function F has a global minimum at
(1.0,1.0). The random starting points were
generated over the component intervals
$X_1$ (-1.3,-1.1), $X_2$ (0.9,1.1).

2.  Powell parameterized badly scaled function of
two variables ($\lambda_0 = 10{,}000$)

a.  $f_1 = \lambda X_1 X_2 - 1$,  $f_2 = e^{-X_1} + e^{-X_2} - 1.0001$

b.  $F = f_1^2 + f_2^2$

c.  $\partial F / \partial X_1 = 2 f_1 (\lambda X_2) - 2 f_2 e^{-X_1}$
    $\partial F / \partial X_2 = 2 f_1 (\lambda X_1) - 2 f_2 e^{-X_2}$

d.  conventional starting point is (0.0,1.0).

For $\lambda = 10^4$, the function F has global minima
at $(1.098 \times 10^{-5}, 9.106)$ and $(9.106, 1.098 \times 10^{-5})$.
The random starting points were generated over
the component intervals $X_1$ (-0.1,0.1),
$X_2$ (0.9,1.1).

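3.  Perturbed Quadratic Function

a.  $F = X_1^2 + 2X_2^2 + 3X_3^2 + 4X_4^2 + \lambda (X_1^4 + 3X_2^4 + 4X_3^4 + 6X_4^4)$

b.  $\partial F / \partial X_1 = 2X_1 + 4 \lambda X_1^3$
    $\partial F / \partial X_2 = 4X_2 + 12 \lambda X_2^3$
    $\partial F / \partial X_3 = 6X_3 + 16 \lambda X_3^3$
    $\partial F / \partial X_4 = 8X_4 + 24 \lambda X_4^3$

c.  fixed starting point is (10,20,30,40).

The function F has a global minimum at
(0,0,0,0). The random starting points were
generated over the component intervals
$X_1$ (9.9,10.1), $X_2$ (19.9,20.1), $X_3$ (29.9,30.1),
$X_4$ (39.9,40.1).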
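
For readers who want to reproduce the test problems, the three functions and their gradients can be written as below. This is a sketch in Python rather than the Fortran FCN subroutines used in the study; the function names and the `(value, gradient)` return convention are choices made here, and `gradient_error` is a hypothetical helper that compares each analytic gradient with a central difference.

```python
import math


def rosenbrock(x, lam):
    """Parameterized Rosenbrock function: value and gradient."""
    x1, x2 = x
    r = x2 - x1 * x1
    f = (1.0 - x1) ** 2 + lam * r * r
    g = [-2.0 * (1.0 - x1) - 4.0 * lam * x1 * r, 2.0 * lam * r]
    return f, g


def powell_badly_scaled(x, lam):
    """Parameterized Powell badly scaled function: value and gradient."""
    x1, x2 = x
    f1 = lam * x1 * x2 - 1.0
    f2 = math.exp(-x1) + math.exp(-x2) - 1.0001
    f = f1 * f1 + f2 * f2
    g = [2.0 * f1 * lam * x2 - 2.0 * f2 * math.exp(-x1),
         2.0 * f1 * lam * x1 - 2.0 * f2 * math.exp(-x2)]
    return f, g


def perturbed_quadratic(x, lam):
    """Quadratic plus lam times a quartic, in four variables: value and gradient."""
    q = (1.0, 2.0, 3.0, 4.0)   # quadratic coefficients
    c = (1.0, 3.0, 4.0, 6.0)   # quartic coefficients
    f = sum(qi * xi ** 2 + lam * ci * xi ** 4 for qi, ci, xi in zip(q, c, x))
    g = [2.0 * qi * xi + 4.0 * lam * ci * xi ** 3 for qi, ci, xi in zip(q, c, x)]
    return f, g


def gradient_error(fcn, x, lam, h=1e-6):
    """Largest relative gap between the analytic gradient and a central difference."""
    _, g = fcn(x, lam)
    worst = 0.0
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += h
        dn[i] -= h
        fd = (fcn(up, lam)[0] - fcn(dn, lam)[0]) / (2.0 * h)
        worst = max(worst, abs(fd - g[i]) / max(1.0, abs(g[i])))
    return worst
```

Checking `gradient_error` at the conventional starting points is a cheap way to catch a transcription mistake in a derivative before any optimization runs are made.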

All the tests reported in this paper were run
in double precision (~16 significant digits) on an
IBM 370/195. The numerical underflow and overflow,
and the divide check interrupts were suppressed.
The random starting points were obtained using the
ANL Applied Mathematics Division library random
number generator G552S.
5.      TEST RESULTS

Graphical results described in Section 3 are
given in Appendix 1 for the above problems.
(Tabular results are not given because of space
limitations.) Only the initial portions of the
graphs are reproduced here, these being sufficient
to illustrate the points raised during the
discussion in Section 2.

a)  Parameterized Rosenbrock's function
$F(X) = (1 - X_1)^2 + \lambda (X_2 - X_1^2)^2$

For λ = 100 this is a well known, difficult
to minimize test function, which has a descending
parabolic valley as its dominant feature. The
starting point (-1.2,1.0) is chosen to bias the
search down the valley. The parameter λ was varied
from 20 to 1000 in steps of 20; increasing λ
increased the difficulty of the problem by making the
valley steeper. The limit on number of calls to
FCN was set to 200.

Figure 1A shows quite clearly that the
performance of CGMIN and that of BFGS are comparable, with OCOPTR
performing substantially better. Figures 1B, 1C
and 1D show the sensitivity to starting points of
each routine. In Figure 1C note in particular how
substantial the variation can be for certain values
of λ, e.g. for λ = 120. Such a choice of λ in a
battery test could suggest a very misleading
ranking of the methods. Although we have not attempted
to track down the reasons for the poor performance
of the BFGS method in certain instances, such
detective work would undoubtedly lead to improvements in
the implementation.

b)  Powell's Badly Scaled Function
$F(X) = (\lambda X_1 X_2 - 1)^2 + (e^{-X_1} + e^{-X_2} - 1.0001)^2$

The test problem is a trough shaped function
for which λ was varied from 10 to 1000 in steps of
10. Changing λ makes this function more badly
scaled and thus more difficult to minimize. The
limit on the number of calls was set to 1000.

Figure 2A shows that a ranking based upon one
particular value of λ can be misleading. For
λ < 180 the performance of OCOPTR and that of CGMIN are
comparable, with OCOPTR somewhat superior. BFGS
performs decidedly worse. However, as λ increases
the performance of CGMIN deteriorates rapidly.
Furthermore, Figs. 2B and 2D demonstrate that for
CGMIN and OCOPTR results are sensitive to starting
points whilst Fig. 2C shows that the performance
of BFGS is relatively insensitive to starting
points.

c)  Perturbed Quadratic Function
$F(X) = \sum_{i=1}^{4} i X_i^2 + \lambda (X_1^4 + 3X_2^4 + 4X_3^4 + 6X_4^4)$

This test function is a parameterized
combination of a quadratic and a quartic function.
When λ = 0, the function is a simple quadratic and
is easy to minimize. However, as λ increases, the
quartic becomes more and more significant. λ was
varied from 0 to 1 in steps of .025.

Except for the case λ = 0, CGMIN proved
superior to BFGS for this problem. Again OCOPTR
performed the best and for λ > 0 was quite
insensitive to the particular value of λ. Also, when
the starting points were varied, the spread
between the maximum and minimum number of calls was
very small in all cases, indicating that the
initial point is not extremely critical in this
case.

6.      CONCLUSIONS

Our primary aim has been to develop a testing
method and software tools centered around
parameterized test problems and graphical display
of results, these being designed to aid the process
of algorithm development. We have illustrated this
method and shown the use of the tools in a simple
case study. The results of this study provide
interesting examples of how optimization routines
behave when applied to parameterized test problems.
The case study, of course, suffers from a number
of drawbacks which must be remedied in a practical
evaluation of optimization routines. For example:

a)  Each routine studied employed a different
convergence criterion. This introduces a lack
of uniformity in the comparison.

b)  A more detailed investigation should study
each routine's performance initially, at
intermediate stages, and in the final stage when
it is near the minimum.

c)  A broad set of test problems should be used,
in particular problems with large or variable
dimension.

d)  For each test function more than one starting
box should be used in order to sample a wider
region of the topography.
Legend for Figures

a)  Figures 1A, 2A, 3A: Average number of calls taken over set of starting
points, plotted against λ. Superimposed plots are given for CGMIN (*),
BFGS (@) and OCOPTR (#).

b)  Remaining Figures: For each routine, Maximum (+), Average (0) and
Minimum (X) number of calls for set of starting points, plotted
against λ.
[Figure 1A]
[Figure: Badly Scaled Function - Average Number of Calls]
ISSUES  IN  THE  EVALUATION  OF  MATHEMATICAL  PROGRAMMING  ALGORITHMS
PART  2:    A  PANEL  DISCUSSION 


CHAIRMAN 


Richard  H.  F.  Jackson 
National  Bureau  of  Standards 
Boulder 


PANELISTS 


Ron  S.  Dembo,  Yale  University 
Jerome  L.  Kreuser,  World  Bank 
John  M.  Mulvey,  Harvard  University 
Richard  P.  O'Neill,  Louisiana  State  University 
James  J.  Filliben,  National  Bureau  of  Standards 
Harvey  J.  Greenberg,  Federal  Energy  Administration 


Provided  herein  is  a  summary  of  the  comments  made  during  this  panel  discussion. 
Although  every  attempt  was  made  while  summarizing  to  portray  accurately  the 
essence  of  each  contribution,    inaccuracies  will  persist.    Responsibility  for 
all  such  inaccuracies  or  misrepresentations  lies  with  R.  Jackson,  who  prepared 
this  summary  and  functioned  as  chairman  during  the  discussion.    Names  and 
addresses  of  all  contributors  to  the  discussion  appear  at  the  end  of  this 
text. 


R. JACKSON: I would like to welcome you to the
panel discussion on "Issues in the Evaluation of
Mathematical Programming Software" sponsored by
the Committee on Algorithms (formerly the Working
Committee on Algorithms) of the Mathematical
Programming Society. The Committee on Algorithms
was created by the MP Society in 1974 with the
charge to concern itself with knowledge,
information, communications, recommendations, and other
actions on mathematical programming algorithms
and testing methodologies. The ambitious goals
of the Committee were identified to be:

o    ensuring that there is available to the
MP community a suitable basis for comparing
algorithms (e.g., a graded set of test
problems);

o    acting as a focal point for knowledge
of computer programs available for
general computations;

o    recommending "best buys" where several codes
are available for the same computation; and

o    encouraging developers to meet certain
standards of portability, ease of use,
and documentation (e.g., by producing
"standards" or guidelines for the
reporting of computational results).

This session, then, is being viewed by the
Committee as another opportunity to create a
dialogue between the committee and the
mathematical programming community. We hope to
focus the discussion today around the question
of guidelines for the publication of the results
of computational experiments, an item that I just
mentioned as being one of our goals. With that
then, I would like to throw the floor open for
questions either from the audience or from the
panelists.

R. DEMBO: I have a comment to make on the paper
that Dick O'Neill presented in the morning session.
In that paper, he conjectured that randomly
generated problems tend to be more difficult to
solve than real-world problems. It has been
my experience that the reverse of that
conjecture is true for Knapsack problems. I wondered
if anyone else had a similar experience.

R. O'NEILL: I'll mention quickly that I can
generate problems that will be solved in one
major iteration by the Dantzig-Wolfe convex
programming algorithm. So you can generate
problems that are very easy to solve. I didn't
want to make that conjecture as strong as it
might have seemed, and I am sure that there are
classes or sub-classes of problems that have been
excluded, but some of the work we have done at
LSU indicates that there is some truth to this
conjecture. In addition, Darwin Klingman and
Gordon Bradley also agree there is some truth
to this conjecture. One point that can be made,
however, is that if you can develop a problem
generator that generates difficult problems, you
are in a good position to test codes that will
eventually be run on real-world problems.

C. WITZGALL: In Jim Ho's lecture today on
staircase problems, he mentioned that Martin Beale found
staircase problems to be rather more difficult to
solve than general, more distributed, linear
programming problems. This could perhaps be a piece
of evidence that works the other way, because
staircase problems are highly non-random.

H. GREENBERG: But the proper comparison would be
to have a generator generate staircase problems,
so you would still be comparing staircase problems
with staircase problems. It's just that one has
random entries and the other has some real data.
You would not compare a general LP with a
staircase problem.

R.  JACKSON:    Is  there  a  generally  agreed  upon 
definition  of  what  a  randomly  generated  problem 
is? 

H. GREENBERG: I think it was at least alluded to
in the presentations of O'Neill and Mulvey this
morning, where they referred to specific structures
with controllable parameters. But the situation
is that you have a set of controllable parameters
like sparsity and row counts and that the randomness
appears in costs and right-hand-sides.

J.  FILLIBEN:     I  would  like  to  make  a  comment  on  the 
relationship  between  the  typical  math  programming 
problem  discussed  earlier  in  the  afternoon  and  a 
measurement  process  that  one  usually  runs  into  in 
scientific  experimentation.    What  you  are  really 
doing  when  you  discuss  what  should  and  should  not 
be  controlled  in  a  mathematical  programming  problem 
generator  is  specifying  a  measurement  process.  A 
well-defined  measurement  process  has  all  of  the 
components  ofthat  process  identified;  for  example, 
the  domain  of  variation.    It  should  be  noted  that 
this  is  different  from  the  simple  injection  of 
randomness  into  a  procedure.    For  example,  in  the 
simplest  situation,  one  lets  everything  vary  with 
nothing  under  control,  and  from  a  statistical  point 
of  view,  you  are  assuming  least,  and  you  end  up 
with  least.    A  better  situation  is  what  statisti- 
cians refer  to  as  a  "constrained  randomization 
problem",  where  certain  of  the  problem  components 
are  removed  from  the  random  domain  and  put  into  the 
deterministic  domain.    The  remaining  components 
that  we  can't  control  we  then  randomize.    In  es- 
sence, you  have  recognized  that  there  are  certain 
problem  parameters  that  are  of  special  interest, 
and  in  this  case  these  are  the  ones  that  are  to 
be  controlled.    What  we  can't  control,  we  randomize. 
Two of the three talks that I heard this afternoon
dealt with situations that could be described as
constrained randomization. The other dealt with
a  completely  random  situation.    But  the  point  that 
I  really  want  to  make  is  that  there  is  a  relation- 
ship between  testing  by  setting  up  problems  in 
order  to  check  out  algorithms,  with  the  corres- 
ponding problem  in  the  scientific  community  of 
defining  a  measurement  process  by  specifying 
exactly  what  conditions  should  be  placed  on  the 
variables  that  we  intend  to  control.    It  has  to 
do  with  much  more  than  just  having  a  random  num- 
ber generator. 
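
The constrained randomization that Filliben describes can be illustrated with a minimal sketch. It assumes a hypothetical generator, `generate_lp`, in which the row count, column count, and nonzeros per row are held fixed (the deterministic domain), while the nonzero positions, coefficients, costs, and right-hand sides are randomized. The function name, the value ranges, and the dense list-of-lists layout are assumptions made for illustration, not any generator discussed at the conference.

```python
import random


def generate_lp(rows, cols, nonzeros_per_row, seed):
    """Return (A, b, c) for a test LP with controlled structure.

    Controlled parameters: rows, cols, nonzeros_per_row.
    Randomized components: nonzero positions and values, costs c,
    and right-hand sides b. All ranges below are illustrative.
    """
    rng = random.Random(seed)  # private generator, so the seed fully determines the problem
    A = []
    for _ in range(rows):
        row = [0.0] * cols
        # Choose exactly nonzeros_per_row distinct columns for this row.
        for j in rng.sample(range(cols), nonzeros_per_row):
            row[j] = rng.uniform(1.0, 10.0)
        A.append(row)
    c = [rng.uniform(1.0, 100.0) for _ in range(cols)]
    b = [rng.uniform(1.0, 100.0) for _ in range(rows)]
    return A, b, c


if __name__ == "__main__":
    A, b, c = generate_lp(rows=4, cols=10, nonzeros_per_row=3, seed=1)
    density = sum(v != 0.0 for row in A for v in row) / (4 * 10)
    print("density:", density)
```

Because the sparsity is fixed rather than sampled, every problem produced with these arguments has the same density; only the components left in the random domain change with the seed.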

C. MYLANDER: It seems to me that in this discus-
sion of using randomly generated or real-world
test  problems,  we  haven't  made  it  as  clear  as  we 
should  just  what  the  purposes  of  this  testing  are. 
It  seems  to  me  that  we  are  testing  codes  and 
algorithms  for  two  purposes.    One  is  to  have  a 
code  certified  and  safe  for  the  user  to  use  and 
the  second  is  to  compare  algorithms  for  solution 
strategies  so  that  developers  of  codes  can  decide 
what  is  the  best  solution  strategy  to  build  into 
a  code.    I  would  like  to  hear  that  kind  of  issue 
discussed  a  little  bit  more.    From  the  point  of 
view  of  a  user,  I'm  willing  to  pay  for  computer 
inefficiencies  so  that  I  can  use  a  small  collec- 
tion of  codes  rather  than  a  larger  collection  of 
specialized  codes.    And  then,  on  another  topic, 
the  one  about  efficiency  of  a  code  on  a  particular 
kind  of  test  problem,  I  think  my  experience  is 
that  the  way  a  problem  is  formulated  causes  much 
greater  variance  in  computer  efficiency  than 
either  the  codes  or  algorithms  themselves. 

R.  DEMBO:     I  think  the  two  different  objectives 
you  mentioned,  that  of  designing  safe  codes  and 
designing  robust  codes,  conflict  to  some  extent. 
If  you  want  to  conduct  experiments  in  order  to 
have  better  equipment  for  designing  codes,  you 
should  probably  use  the  ideas  presented  in  Larry 
Nazareth's talk this morning, because there you
are  testing  specific  properties  of  an  algorithm. 
On  the  other  hand,  if  you  just  take  arbitrary 
problems  and  generate  them,  you  have  no  idea 
whether  you  are  testing  to  see  if  there  is  a 
very  steep  valley  you  are  going  into  or  what 
happens  if  you  elongate  the  valley.    So,  if  you 
want  to  design  algorithms,  I  would  say  what  you 
should  do  is  pick  problems  by  hand  and  perturb 
them.    If  you  want  to  test  robustness  however, 
you  would  probably  take  as  wide  a  class  of 
problems  as  you  can  generate.    But  that  means, 
of  course,  that  if  you  are  really  interested  in 
testing  robustness,  that's  probably  all  the 
information  you  get. 

H.  GREENBERG:     I  think  there  are  at  least  three 
purposes  for  doing  any  kind  of  experimentation 
along  the  lines  we  have  been  discussing.  The 
first  purpose  has  to  do  with  program  correct- 
ness:   to  see  whether  the  program  is  correct. 
I  really  don't  know  how  to  do  that,  especially 
when  you  consider  that  the  set  of  inputs  that 
absolutely  would  guarantee  correctness  is  larger 
than  the  total  number  of  bit  patterns  in  the  input 
data.    A  second  reason  is  to  do  benchmarking  for 
the  purpose  of  selecting  an  algorithm  that  you  are 
going to buy to solve problems. I think the
ε-perturbation method that Larry Nazareth proposed
is a good approach to that problem. A third
reason for conducting such experiments is to dis-
cover the poorer aspects of a given algorithm so
that  that  poor  aspect  can  be  fixed,  thereby  yielding 
a  better  algorithm.     I  should  note  that  you  want  to 
be  careful  not  to  gear  algorithm  improvement  towards 
a  worn-out  test  problem,  like  Rosenbrock's  function. 
I  think  too  many  algorithms  have  been  designed  to 
do  well  on  Rosenbrock's  function,  especially  Rosen- 
brock's  algorithm,  but  it  is  not  clear  to  me  that 
using  Rosenbrock's  function  will  necessarily  do 
anything  for  improving  your  ability  to  solve  non- 
linear programming  problems  in  general.    So  I 
think  that  we  have  to  take  great  care  in  designing 
a  collection  of  problems  that  in  some  way  contri- 
butes insights  into  algorithm  improvement. 

C.  MYLANDER:    I  think  we  have  to  check  the  battery 
approach  against  sporadic  selection  of  all  sequences 
of test problems that are available to code devel-
opers.

H.  GREENBERG:    But  in  that  case  you  only  have  a 
dozen  hand  selected  problems  with  no  indication 
of  their  properties  or  what  happens  when  you  start 
to  vary  things.    I  think  the  generation  approach 
is  more  helpful  in  that  avenues  for  research  in- 
clude getting  a  handle  on  a  reasonably  small  set 
of  parameters  that  is  large  enough  to  capture  every 
aspect  of  problem  structure  that  in  some  sense 
gives  you  reasonable  representation  of  the  real- 
world  class  being  addressed. 

D.  SHIER:     I  wanted  to  follow  along  with  the 
earlier topic of defining "randomly generated"
problems. I sometimes feel a little queasy about
the  definition  of  what  is  a  randomly  generated 
problem  and  what  one  actually  gets  out.     I'd  like 
to  give  you  an  example.    Suppose  I  want  to  have 
five randomly chosen numbers that sum to one.
One way to do this is to generate four of the
numbers that sum to less than one and then sub-
tract that sum from one to get the fifth number.
Another  way  to  do  it  is  to  generate  five  numbers 
randomly,  and  then  divide  them  by  the  sum.  How- 
ever, that  will  not  give  you  a  uniform  distribu- 
tion over  the  space  of  interest.    My  point  is  that 
I  think  we  have  to  look  very  carefully  at  what 
exactly  constitutes  a  random  problem.    I  also 
think  that  when  you  start  to  specify  side  condi- 
tions, especially  the  Kuhn-Tucker  conditions  for 
a  given  problem,  there  may  be  some  problems  there 
that  I  don't  believe  have  been  addressed  by  those 
who  have  produced  test  problem  generators. 
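
Shier's example can be checked numerically. The sketch below is an illustration under stated assumptions, not a method from the discussion: `normalized_uniforms` draws five uniform numbers and divides by their sum, while `uniform_on_simplex` uses the spacings between sorted uniform cut points, a standard construction that is uniform over the set of nonnegative vectors summing to one. The function names and the choice of statistic (how often the largest component exceeds one half) are hypothetical choices made for the comparison.

```python
import random


def normalized_uniforms(n, rng):
    """Draw n uniform numbers and divide by their sum (not uniform on the simplex)."""
    xs = [1.0 - rng.random() for _ in range(n)]  # values in (0, 1], so the sum is never zero
    total = sum(xs)
    return [x / total for x in xs]


def uniform_on_simplex(n, rng):
    """Spacings between n-1 sorted uniform cut points on [0, 1] (uniform on the simplex)."""
    cuts = sorted(rng.random() for _ in range(n - 1))
    edges = [0.0] + cuts + [1.0]
    return [edges[i + 1] - edges[i] for i in range(n)]


def frac_max_above_half(sampler, n, trials, seed):
    """Fraction of samples whose largest component exceeds 1/2."""
    rng = random.Random(seed)
    hits = sum(max(sampler(n, rng)) > 0.5 for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for sampler in (uniform_on_simplex, normalized_uniforms):
        print(sampler.__name__, frac_max_above_half(sampler, 5, 20000, seed=1))
```

For five components, the uniform construction puts one component above one half in roughly 31 percent of draws, whereas normalizing uniforms does so in roughly 4 percent. Both procedures return "five random numbers that sum to one", yet they describe quite different populations of test data.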

R.  O'NEILL:     I  agree  with  you.     In  one  of  the 
experiments  that  I  conducted,  the  only  thing  that 
changed  in  a  problem  from  one  run  to  another  was 
that  the  cost  row  was  generated  by  way  of  the 
Kuhn-Tucker  conditions,  as  opposed  to  being  ran- 
domly generated  with  values  between  1  and  100. 
In  one  comparison,  the  Kuhn-Tucker  generated  problem 
took  significantly  more  time,  but  in  another  case, 
the  times  were  about  the  same.    So  I  agree  with  you 
that  we  should  try  to  determine  the  properties  that 
these  test  problem  generators  have,  but  that's  a 
difficult  area. 

J.  FILLIBEN:     I  might  mention  this  point:  that 
there  are  a  wide  variety  of  statistical  tests 
that one can use for checking various random
number  generators. 


R. DEMBO: Let me add one thing to this. When John
Mulvey  generated  the  test  problems  for  the  paper 
we  presented  this  morning,  we  discovered  that 
another  problem  with  generation  comes  into  the 
type  of  sampling  that  you  want  to  do  as  well.  We 
wanted  to  do  simple  random  sampling,  so  we  had, 
in  some  sense,  defined  our  population.    We  there- 
fore picked  a  problem  at  random  from  that  popula- 
tion.   Now,  we  had  each  random  variable  varying 
uniformly  according  to  certain  distributions  but 
I'm  still  not  sure  if  what  we  did  can  properly 
be  called  simple  random  sampling.    I  wish  I  were 
sure  of  that. 

R.  O'NEILL:    It's  not  clear  whether    test  problem 
generators  have  any  sort  of  inherent  biases  in 
them  and  I'm  not  sure  whether  you  can  ever  deter- 
mine that  conclusively. 

R.  DEMBO:    They  do  have  a  bias:  feasibility. 

H.  GREENBERG:    There  are  other  kinds  of  biases 
too.    There  are  generators  that  may  be  biased 
toward  certain  kinds  of  algorithms,  so  they  would 
necessarily  show  that  kind  of  algorithm  in  a 
favorable  light.    That's  very  little  understood 
and  I  think  one  of  the  interesting  avenues  of 
research  is  directed  toward  understanding  gener- 
ator biases. 

J.  MULVEY:    We  have  to  be  realistic  about  some  of 
these  things.    Certainly  it's  very  important  to 
analyze the differences between real-world prob-
lems and generated problems but I think that as
researchers  if  we're  ever  going  to  solve  large 
numbers  of  problems,  we'll  have  to  use  test  prob- 
lem generators.    I  think  that  for  the  future  we 
can  draw  some  parallels  from  some  of  the  other 
scientific  disciplines.    For  example,  there  is 
an  American  Society  for  Testing  Materials  that 
meets  in  this  building,  in  fact,  which  is  res- 
ponsible for  developing  standards  for  testing 
materials.    I  believe  that  is  something  that 
will  eventually  happen  to  software:    it  will  be- 
come an  engineering  function  to  test  programs. 
But  I  think  until  such  a  society  or  institute 
is  created,  we're  going  to  have  to  look  at  these 
portable  generators  for  methods  of  comparing 
our  algorithms.    They're  much  easier  to  use 
than  trying  to  shift  large  amounts  of  problems 
back  and  forth  between  researchers.    I  just 
don't  think  that's  realistic  at  this  time. 

R. DEMBO: But that doesn't detract from testing
the statistical properties of test problem gen-
erators.

J. MULVEY: No, in fact we should look more at
the statistical properties of generators be-
cause I think that's the way the research is
going to go.


D. SHIER: Let me add a comment to that. At one
time we had a summer student who was doing some
work on scheduling problems and had occasion to use
one of the standard pseudo-random number generators.
It turned out that whenever the number of items being
scheduled was divisible by 2, very unpredictable
results occurred. For example, regardless of the
number of items being scheduled, in some cases
he had hundreds of classes, it all could be
scheduled within two hours. It turned out that
2 was the seed of the random number generator. So
that  sometimes  it  comes  and  hits  you  on  the  head 
that  it  really  is  necessary  to  dig  down  and  look 
at  the  random  number  generators  that  we're  using. 

J.  FILLIBEN:    This  brings  up  the  point  that  a  given 
random  number  generator  is  only  "random  enough"  by 
comparison  to  the  purpose  for  which  it  is  to  be 
used; and it sounds like, for mathematical program-
ming test problem generation, there are some strong
dependencies  on  very  subtle  properties  of  random 
number  generators  that  are  being  used.  Furthermore, 
these  properties  may  in  fact  depend  on  the  algorithm 
on  which  the  problem  is  to  be  run.    So  that  a  given 
random  number  generator  may  be  alright  for  a  certain 
type  of  algorithm  but  not  alright  for  another  type 
of  algorithm  in  the  worst  possible  case.    So  getting 
back  to  my  original  point,  I'd  like  to  emphasize 
that  whether  a  given  set  is  random  enough  really 
depends  on  for  what  purpose  that  set  is  to  be  used. 
This  in  turn  implies  a  need  for  a  thorough  under- 
standing of  the  various  kinds  of  mathematical  pro- 
gramming algorithms  and  the  properties  of  each  algo- 
rithm as  they  relate  to  test  problem  generation.  I 
don't  think  that  such  understanding  exists  yet. 
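
One well-known way a generator can be "random enough" for one purpose and not for another is sketched below. This is an illustration only, and it is not the generator from Shier's anecdote: it assumes a linear congruential generator with a power-of-two modulus, with constants chosen for the example. With an odd multiplier and an odd increment, the lowest bit of such a generator strictly alternates, so any test problem generator that takes `x % 2` to make a two-way choice would get no randomness at all, whatever the seed.

```python
from itertools import islice


def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Linear congruential generator; the constants are illustrative assumptions."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state


def low_bits(seed, count):
    """Lowest bit of each of the first `count` outputs."""
    return [x % 2 for x in islice(lcg(seed), count)]


def high_bits(seed, count, m=2**31):
    """Top bit of each output, which does not share the period-2 defect."""
    return [x // (m // 2) for x in islice(lcg(seed), count)]


if __name__ == "__main__":
    print("low bits: ", low_bits(seed=2, count=16))
    print("high bits:", high_bits(seed=2, count=16))
```

The same stream may be adequate when only the high-order bits are used, for example to draw costs in a range, which is the sense in which adequacy depends on the purpose.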

R.  JACKSON:    This  appears  to  indicate,  then,  that  in 
addition  to  investigating  algorithm  performance  on 
particular  classes  of  problems,  we  should  also  in- 
vestigate the  correlation  between  a  particular  algo- 
rithm and  the  particular  random  number  generator 
used  in  the  test  problem  generator  that  produced 
the  problems  on  which  an  algorithm  will  be  evalu- 
ated.   And  if  this  is  true,  then  we  have  identified 
another  avenue  for  research  in  evaluation  metho- 
dology. 

R.  DEMBO:    I  would  like  to  suggest  that  we  move  on 
to  an  important  topic  with  immediate  consequences: 
the  issue  of  guidelines  for  publications.    I  think 
it's  very  important  that  we  tighten  up  the  criteria 
used  for  selecting  articles  for  publication  when 
they  contain  computational  results.    I  would  like 
to  put  it  out  as  an  open  question  for  the  audience 
to  inform  us  if  they  have  any  suggestions  as  to 
what  guidelines  could  be  implemented  immediately. 
Certain  guidelines  are,  to  my  mind,  obvious.  For 
example,  specifying  a  compiler  when  you  run  a  prob- 
lem; specifying  convergence  criteria  when  dealing 
with  nonlinear  programs,  but  I  could  go  on  and  on. 

R.  JACKSON:    This  general  issue  of  guidelines  is 
a  very  difficult  one  and  I  agree  with  Harvey  that 
trying  to  discuss  the  general  question  of  publica- 
tion guidelines  might  be  difficult.  However, 
certainly  there  are  aspects  of  that  question  that 
we could get into.

J.  MULVEY:     I  think  we  could  go  right  to  the  ques- 
tion of  test  problems.     For  example,  should  we 
require  researchers  to  provide  test  problems  to 
other  researchers  when  they  publish  results?  I 
think that is a question that we can pose for re-
producibility.

R.  JACKSON:    Before  we  move  into  the  discussion  of 
the  collection  of  test  problems,  I  would  like  to 
provide  a  little  background.     In  the  first  place, 
we  are  assuming  that  in  the  absence  of  any  other 
well  accepted  method  of  comparing  one  algorithm 
against  another,  the  test  battery  method  is  a 
valid one. Furthermore, given the fact that the
test battery method is an acceptable method, then
there  ought  to  be  produced  a  well  accepted  set  or 
perhaps  even  a  graded  set  of  test  problems  that 
are  to  be  used  by  all  researchers.    So  the  question 
becomes  then,  how  can  such  a  set  of  test  problems 
be  created?    This  has  been  looked  into  by  a  number 
of  groups  over  the  years;  SHARE  did  that,  Wolfe 
did  that,  along  with  many  others.    The  question 
confronting  the  committee  now  is  what  kinds  of 
ways  are  available  to  produce  such  a  graded  set 
of  test  problems?    One  suggestion  is  to  require 
that  anyone  publishing  a  paper  containing  a  com- 
putational comparison  of  codes  should  make  those 
codes  available  to  all  other  researchers  and 
perhaps  even  a  central  facility  should  be  created 
for  storing  them. 

H.  GREENBERG:     I  think  the  question  that  has  now 
been  brought  to  the  floor  is  not  guidelines  for 
running  tests,  but  guidelines  for  publication  of 
tests.    We  are  narrowing  the  subject  to  just  the 
issue  of  publication  criteria  where  the  contribu- 
tion hinges  on  the  claim  of  a  better  algorithm. 
One  of  the  proposals  then  is  that  no  one  should  be 
allowed  to  publish  results  claiming  better  perfor- 
mance of  an  algorithm  on  a  set  of  problems  that 
are  not  in  the  public  domain.    More  specifically, 
whatever  problems  have  been  used  by  researchers 
must  be  made  available  either  for  the  sake  of 
reproducibility by referees or for other research-
ers to try out their methods on the same set of
problems.     In  short  then,  it's  the  publication 
issue  that's  been  put  on  the  table,  not  the 
general  issue  of  guidelines. 

R.  O'NEILL:    Can  I  confound  the  problem?  How 
about  providing  the  code  itself? 

R.  DEMBO:     I  suggest  that  we  attack  the  easier 
things  first  before  we  get  on  to  the  trickier 
problems  like  exactly  which  guidelines  you  would 
suggest  for  publication.    There  are  certain  stan- 
dards that  could  be  implemented  immediately.  For 
example,  reproducibility:  a  referee  should  be  con- 
vinced that  an  experiment  could  be  reproduced. 
That's  an  example  of  what  I  think  is  an  important 
criterion.

R.  O'NEILL:     I  could  add  that  I  think  all  of  the 
important  controllable  parameters  should  be  in- 
cluded like  machine,  compiler,  operating  system. 
It  is  not  necessary  to  pile  up  the  paper,  but 
this  could  be  included  in  an  appendix. 
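
As a present-day illustration of O'Neill's suggestion, the sketch below collects the controllable parameters of a run into one record that could be printed in an appendix. It is a minimal sketch under stated assumptions: the function name `experiment_record`, the field names, and the example problem name are hypothetical, and the "compiler" entry reports the compiler used to build the Python interpreter, standing in for the compiler of whatever code is actually being tested.

```python
import json
import platform


def experiment_record(problem_name, seed, tolerance):
    """Gather run parameters and environment details into one dictionary."""
    return {
        "problem": problem_name,          # which test problem was run
        "seed": seed,                     # seed given to the problem generator
        "tolerance": tolerance,           # convergence criterion used
        "machine": platform.machine(),
        "operating_system": platform.system() + " " + platform.release(),
        "language": platform.python_implementation() + " " + platform.python_version(),
        "compiler": platform.python_compiler(),
    }


if __name__ == "__main__":
    record = experiment_record("staircase-01", seed=7, tolerance=1e-6)
    print(json.dumps(record, indent=2))
```

Keeping the record machine-readable means it can accompany the results without adding pages to the body of the paper.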

H.  GREENBERG:     I'd  like  to  point  out  that  in  the 
early  days  of  the  development  of  the  SIMPLEX 
method,  if  very  severe  restrictions  had  been 
placed  on  the  publication  of  results,  it  may 
never  have  gotten  published. 

J.  GILSINN:     In  these  days  of  page  limits  per 
article,  trying  to  include  a  listing  of  a  computer 
code is probably not a realizable idea, but per-
haps it is possible to require that enough back-
ground information be included in such an article
that  would  then  allow  reproducibility  if  one 
established  contact  with  an  author.  Another 
choice  would  be  to  provide  some  central  reposi- 
tories where  the  codes  could  be  obtained.    But  I 
think  there  is  a  conflict  here  that  must  be 
resolved.

H. GREENBERG: I think you have a line drawing
problem  because  some  codes  have  thousands  of  lines 
and  it  is  not  possible  to  include  much  information 
about  serious  large  scale  systems. 

J. GILSINN: Yes, but I also want something more
than  just  so  and  so's  algorithm.     I  want  informa- 
tion about  how  that  algorithm  was  implemented.  I 
want  to  know  information  about  what  research  tech- 
niques were  used. 

A.  WILLIAMS:    Both  problems  have  been  handled 
simply  by  having  the  author  include  a  statement 
that  a  listing  of  his  code  is  available.    A  number 
of  the  journals  are  encouraging  authors  to  do  this 
kind  of  thing. 

H.  GREENBERG:    What  do  you  do  then  about  proprietary 
systems  where  it's  informative  to  get  into  a  journal 
some  analysis  but  where  the  owners  of  that  system 
are not going to release their proprietary informa-
tion about the code in the form that you're dis-
cussing. Most certainly they're not going to make
available  source  listings  of  such  systems  as  MPSX. 
Can  we  eliminate  that  from  the  journals? 

A.  WILLIAMS:    In  my  opinion,  seriously,  we  wouldn't 
eliminate  it,  but  we  would  charge  them  an  advertising 
fee. 

H.  GREENBERG:     I  think  it's  not  a  binary  situation. 
I  think  that  there  is  much  useful  information  that 
can  be  contained  in  an  article  without  requiring 
that  a  listing  be  associated  with  it. 

R.  DEMBO:    When  I  originally  mentioned  this,  I 
didn't have in mind a requirement that authors
submit  listings.    My  concern  was  with  requiring 
referees  somehow  to  convince  themselves  that  given 
a  listing,  they  could  reproduce  the  results.  They 
had  to  convince  themselves  of  the  integrity  of  the 
authors.

H.    GREENBERG:     I  don't  think  that  integrity  is  the 
issue.  I  think  the  transmittal  of  information  is 
the  issue. 

E.  HELLERMAN:    In  this  connection,  what  worries  me 
is  that  when  comparisons  are  made,  they're  made 
against  your  code  and  someone  else's  code.  Very 
frequently,  it's  someone  else's  code  that  is  at 
stake  and  in  this  case  we  don't  know  what  kind 
of  implementation  that  person  has  of  that  other 
person's  code.    This  is  worrisome. 

R.  JACKSON:     I  think  we  should  keep  in  mind  the 
purpose  for  requiring  additional  information  about 
algorithm  or  code.    We're  seeing  lately  that  it  is 
no  longer  sufficient  to  run  a  few  problems  on  two 
codes  and  report  overall  CPU  times  which  are  then 
used  to  support  claims  of  superiority.     It  is  be- 
coming more  and  more  important  to  understand  what 
is  going  on  in  the  interior  of  the  codes  that  are 
being  compared.    Knowledge  of  pivot  strategies, 
for  example,  is  important.    And  if  you  deal  with 
proprietary  codes  then  there  is  no  satisfactory 
way  to  get  information  about  what  kinds  of  tech- 
niques are  used  in  the  algorithm  of  which  that 
code  is  an  implementation.    In  this  case  then, 
we're  left  in  a  situation  where  that  code  simply 
cannot  be  compared. 


J.  MULVEY:    If  you  really  want  to  be  strictly 
scientific,  I  think  you  have  to  have  a  code  avail- 
able because  we're  testing  codes,  not  algorithms. 
I  think  we've  thrown  about  the  words  algorithm, 
code  and  software  a  little  haphazardly  and  that 
in  our  experiments  we  deal  with  codes  and  make 
inferences  about  algorithms.    So  that,  if  we're 
going  to  be  scientists,  those  codes  should  be 
available.    However,  I  don't  think  that  is  a  feas- 
ible suggestion  right  now.    There  will  be  people 
who  are  unwilling  to  make  their  codes  available 
and  that  would  just  shut  off  what  fearful  little 
results  we  have  now,  at  least  reduce  it  quite  a 
bit.    I  think  we  can  work  slowly  toward  requiring 
more  and  better  information  to  be  provided  either 
through  an  appendix,  or  by  way  of  the  authors 
themselves.    Our  problem  then,  is  to  determine 
how  to  increase  the  informational  flow. 

H.  GREENBERG:    I  think  it  has  to  be  acknowledged 
though,  that  there  is  a  trade-off.    The  goal  is 
to  increase  the  transmittal  of  useful  information 
but  if  you  impose  restrictions  saying  "You  will 
report  this",  that  doesn't  necessarily  result  in 
an  increase  of  information  because  the  reply  can 
be  "No,  I  won't." 

R.  JACKSON:    The  situation  then  becomes  what  Al 
suggested  where  authors  should  be  charged  for 
advertising.    Papers  would  be  appearing  that  are 
essentially  a  claim  of  superiority  with  no  sup- 
porting information  allowing  a  replication  of 
the  experiment  or  even  the  checking  of  the  types 
of  strategies  and  techniques  used  within  a  code. 
I'm  not  saying  that  is  not  transmitting  more 
information.    I'm  simply  saying  as  a  code  com- 
parison or  a  claim  of  superiority,  there  is  much 
room  for  improvement. 

J. MULVEY: I think there's an analogy to be drawn
in  consumer  unions  in  the  sense  that  they  go  out 
and  test  and  break,  and  produce  good  results  on 
that  empirical  evidence.    In  our  case  we  could 
have  a  group  that  goes  to  a  particular  installa- 
tion with  a  battery  of  test  problems  and  says 
"Here,  solve  it  on  your  machine,  on  your  algo- 
rithm, etc."    The  big  question  is:    who's  going 
to  do  that  work?    That's  not  clear. 

E. HELLERMAN: I think a bigger question is:
who's going to pay for it?

C. MYLANDER: The suggestion that ought to be put
forward is that requirements for publication ought
to be that the test problems and the hour that the
test was run are fully specified. I don't think
that it's necessary to specify the codes also. In
this way, if I have a code to solve a class of
problems, I could run on exactly the same test
problem if I wanted to go out and rent time on
that same computer and run under the same opera-
ting system so that a valid comparison could be
made.

J.  MULVEY:    I  think  he  used  an  adjective  that 
forces  you  into  the  situation  of  sending  out  code 
and  that  was  that  the  experiment  must  be  fully 
described.    Fully  described  means  that  you  have 
to  have  a  code.    You  can  summarize  it,  certainly, 
but you will be losing information.

UNKNOWN: It seems to me that we have two different
items  here;  algorithms  and  codes,  and  that  they're 
completely  different.    If  someone  wants  to  say  "I 
have  a  code  and  it's  a  good  one  but  I  don't  want  to 
tell  you  what's  in  it"  that's  fine,  so  long  as  he 
presents  the  problems  that  he  ran  and  presents  his 
tests  along  with  the  environment  in  which  he  ran 
it.    If  it's  an  algorithm  he's  pushing,  it's  a 
different  story.    He  must  describe  the  algorithm 
behind  the  code  in  detail  to  make  it  valid,  and  he 
must  also  provide  a  listing. 

H.  GREENBERG:    I  think  a  good  example  is  the 
Hellerman-Rarick  invert  routine,  where  the  invert 
code  was  not  made  available  for  free,  but  it  was 
nevertheless  a  valuable  contribution  to  the  litera- 
ture to  have  it  published. 

A.  WILLIAMS:    Why,  though,  would  something  like 
that  be  published  as  a  scientific  paper  rather 
than  as  an  advertisement  or  flyer? 

H.  GREENBERG:    I  don't  think  the  name  "Management 
Science  Systems"  meant  anything  to  someone  reading 
the  article  so  I  don't  think  it  was  an  advertise- 
ment.   I  think  rather  it  was  a  major  contribution 
to  understanding  how  reinversion  should  be  done. 

A.  WILLIAMS:    I'm  not  familiar  with  the  details. 
Did  they  speak  broadly  about  what  was  done,  not 
specifically?

H.  GREENBERG:    No,  they  gave  specific  algorithms 
but  the  implementation  of  the  algorithm  and  the 
code  was  not  specified  and  that  makes  a  big  dif- 
ference when  you  try  to  duplicate. 

A.  WILLIAMS:     I  guess  I  don't  understand.  When 
there's  some  kind  of  a  mathematical  procedure, 
unless  you  knew  the  real  trick  when  coding  it, 
you mean you could not code it?

H.  GREENBERG:    No,  you  could  code  it.    I  coded  it. 
But  you  almost  certainly  don't  have  as  good  an 
implementation  as  the  one  Dennis  Rarick  had.  So 
that  if  you  tried  to  compare  MPSX's  invert  routine 
with  your  homegrown  code,  you  won't  necessarily 
be  comparing  like  items  unless  you're  as  skilled 
a  programmer  as  Dennis  Rarick  was.    Very  few 
people are. Nevertheless, the whole algorithm
is  there  and  there's  enough  information  for  you 
to  code  it.    On  the  other  hand,  it's  not  just  an 
algorithm.    It  offered  a  whole  new  concept  of 
looking  at  reinversion.    The  idea  of  a  spike  was 
introduced  in  that  paper  which  has  become  a  class- 
ical paper  in  the  literature  of  mathematical 
programming  systems.    Much  subsequent  research 
has  taken  place  because  of  it. 

J.    MULVEY:    Why  does  the  company  allow  you  to 
publish  an  algorithm  which  gives  the  essence  of 
an idea and not publish the software? It seems
to  me  that  both  the  idea  and  the  implementation 
are  proprietary. 

H.  GREENBERG:    It's  for  precisely  that  reason. 
Because  the  full  power  of  the  method  rests  upon 
such  clever  coding  that  they  were  unafraid  of 
the  competition  and  allowed  it  to  be  published. 


E.  HELLERMAN:    Actually,  P3  was  described  in  terms 
of  ALGOL.    There  were  ALGOL-like  statements  descri- 
bing every  facet  there.    Also,  I've  gotten  reports 
from  a  number  of  universities  stating  that  they 
implemented  the  algorithm  strictly  on  the  basis  of 
what  we  had  written  in  that  report,  P3.    It  was 
clear  cut  that  they  could  write  their  own  code 
and  have  it  working  almost  immediately. 

M.  GUTTERMAN:    Actually  the  code  itself  is  depen- 
dent not  only  on  the  machine,  but  on  the  data 
structure  with  which  you  happen  to  choose  to  rep- 
resent your  LP  matrix  and  LP  inverts. 

UNKNOWN:    We're  actually  introducing  a  third  prob- 
lem here  because  the  code  structure  is  another 
dimension  completely. 

H.  GREENBERG:    Absolutely.    You  run  into  that  all 
the  time  where  a  major  revision  in  the  data  struc- 
ture has  a  greater  impact  on  the  resulting  per- 
formance of  a  code  than  a  major  change  in  the 
algorithm  tactics. 

M.  GUTTERMAN:     I  can  testify  to  that  one  personally 
since  I've  been  involved  in  looking  at  the  results 
of  three  separate  implementations  of  the  Kalan 
Matrix  Packing  Ideas  on  the  same  computer. 

R.  JACKSON:    Perhaps  we're  getting  to  the  point 
where  we  ought  to  broaden  our  definition  of  what 
exactly  an  algorithm  is. 

R.  DEMBO:    No,  I  think  my  original  comments  re- 
ferred not  to  whether  an  algorithm  was  published 
in  coded  form  or  not.    I  agree  with  the  comment, 
by  the  way,  that  if  you  publish  the  code  it  would 
mean  less  than  publishing  a  mathematical  descrip- 
tion.   But  let's  take  the  paper  P3  as  an  example. 
If  claims  were  made  in  that  paper  that  the  new 
P3  invert  procedure  was  much  better  than  what  had 
been  done  previously,  the  question  then  becomes 
whether  enough  information  was  given  in  that  paper 
to  allow  those  claims  to  be  tested  by  someone  else 
willing  to  code  the  invert  procedure.    It  must  be 
understood  that  these  comments  are  made  with  the 
understanding  that  this  new  researcher  might  be  a 
worse  programmer  than  Dennis  Rarick  was. 

J.  GILSINN:     I'd  like  to  go  back  to  a  previous 
topic  of  conversation.    Are  you  thinking  in  this 
set  of  guidelines  of  not  allowing  a  paper  to  be 
published  if  it  presents  computational  results 
about  a  code  that  is  a  proprietary  one?  Because 
it  seems  to  me  that  users  of  codes  are  most  inter- 
ested in  knowing  the  performance  of  a  particular 
code  against  one  of  the  better  known  codes  and 
very  often  these  better  known  codes  are  the 
proprietary ones. Who's going to restrict that
kind  of  a  situation? 

H.    GREENBERG:    I  think  the  correct  answer  is  that 
we don't know.

R.  JACKSON:    The  issue  is  getting  a  little  more 
complicated,  and  I'm  not  sure  we're  defining  our 
terms  very  well.    Our  original  topic  of  conversa- 
tion was  the  development  of  a  set  of  guidelines 
for  the  publication  of  computational  results  where 
the  ultimate  aim  is  to  see  better  comparisons  made. 
We  went  off  into  the  tangent  of  discussing  whether 
the code itself should be published as a result
of the comment that if you don't see the code you
can't do a proper comparison. Another issue has
been raised, however, that in the absence of complete
listings of the code for whatever reason,
proprietariness or unwillingness of journal editors
to include listings, is there any way that a fair
computational comparison can be made about codes?
I would like to see that topic discussed more so
than  the  current  topic  of  whether  codes  should  be 
published  or  not.     It  is  clear  that  proprietary 
codes  are  not  going  to  be  published. 

W.  ORCHARD-HAYES:    But  related  to  that  question  is 
the argument that there is no such thing as reproducible
complicated coding. It depends on too
many  factors  including  the  style  of  the  author, 
the  compiler  you're  using  and  whatnot. 

R.  DEMBO:    I  didn't  mean  reproducing  code,  I  meant 
reproducing  experiments. 

W.  ORCHARD-HAYES:    No,  I  mean  reproducing  results. 

R.  DEMBO:    Well,  I  think  you're  right.    There  are 
a  lot  of  different  factors  that  enter  into  it,  but 
you  should  be  able  to  attack  the  same  problems, 
using  the  same  tolerance  criteria  as  the  authors 
did  and  produce  the  same  results. 

W.  ORCHARD-HAYES:     I  guess  the  way  I'm  reading  it, 
it's  a  question  of  relevance.    What  difference  does 
it  make? 

R.  JACKSON:  The  answer  to  that  revolves  around 
the  question  of  replication  as  a  necessary  part 
of  scientific  endeavor. 

R. DEMBO: The point is that you're making inferences.

H. GREENBERG: I think the payoff is a bit better
than that now. I think it's hard to understand
when  there  are  no  controls.    If  one  person  runs 
an  experiment  with  one  problem  and  someone  else 
runs  another  experiment  with  another  problem, 
there  is  no  way  of  knowing  whether  you've  made 
any  progress.    I  think  the  value  of  reproducibility, 
to  the  extent  that  it's  feasible,  is  that  it  does 
give  you  a  measure  of  progress. 

J.  MULVEY:    I  think  there  is  some  value  in  saying 
why  one  code  is  better  than  another  in  that  when 
someone  else  tries  to  duplicate  the  experiment 
they  can  at  least  have  some  systematic  way  of 
trying  to  get  the  same  kind  of  results. 

H.  GREENBERG:     I  think  it's  true  for  the  opposite 
reason, namely to inject the objectivity that makes
it  less  dependent  on  the  judges.    I  think  that's 
the  point  of  reproducibility. 

E.  HELLERMAN:     I  think  there's  an  awful  lot  of  pure 
artistry  here,  artistry  in  implementation.    Now,  I 
would  stack  anything  that  Bill  writes  against  any- 
thing that  anybody  else  writes,  and  I  know  it's 
going  to  be  better  because  Bill's  an  artist  at 
this  kind  of  thing.     It's  like  looking  at  a  painting 
of  the  same  landscape  by  two  different  artists. 
They're  going  to  look  a  little  different  no  matter 
how  hard  they  try  to  be  the  same.     I  don't  know  if 
you can develop a criterion for pitting one against
another  to  determine  which  one  is  better. 


R.  JACKSON:    I  think  I  would  like  to  disagree  with 
you  on  that.     I  agree  with  you  that  there's  an 
incredible  amount  of  artistry  in  producing  some 
of  these  codes,  but  artistry,  I  think,  once  it  is 
around  long  enough  and  used  over  and  over  again 
is  no  longer  artistry.    It  becomes  documented 
fact.    Let's  take  list  structures  as  an  example. 
At  some  point  in  time  years  ago,  how  data  were 
stored,  organized  and  retrieved  was  almost  a 
black art. But now techniques for organizing data
and retrieving it appear in textbooks. And this
is getting back to what I mentioned a while ago,
that  it  might  be  necessary  to  expand  our  defini- 
tion of  an  algorithm  to  include  such  items  as 
specific  coding  tricks. 

W.  ORCHARD-HAYES:  Why  not  just  say  "list/structure" 
then?    Why  isn't  that  sufficient? 

R.  JACKSON:    My  point  here  is  that  there  are 
probably  other  aspects  of  what  is  now  artistry 
that  should  be  included  in  this  expanded  defini- 
tion of  an  algorithm. 

H.  GREENBERG:    I  agree  that  what  was  art  ten  years 
ago  may  be  partially  science  now. 

W. ORCHARD-HAYES: I'd like to make one more point
about  comparing  codes  and  that  point  has  to  do  with 
portability.    I  just  can't  believe  that  it's  pos- 
sible to  carry  one  code  from  one  machine  to  another 
machine  and  compare  it.    Codes  just  aren't  that 
portable  anymore. 

H.  GREENBERG:    Right.    We  have  identified  machine 
characteristics  as  a  set  of  variables  to  be  reck- 
oned with  in  the  design  of  an  experiment.    I  think 
through  scientific  investigation  we  can  get  a  much 
better  understanding  of  the  various  computer  and 
algorithmic  effects  on  the  results  of  an  evalua- 
tion.   That,  of  course,  is  the  point  of  the  inves- 
tigation:   to  understand  the  effects  of  the  vari- 
ables.   It's  my  feeling  that  there  will  be  some 
interesting  results  from  the  work  done  recently 
by  Dick  O'Neill.     I  believe  there  will  be  some 
very  strong  and  counter-intuitive  results  from 
that  work. 

M. GUTTERMAN: I personally don't believe in printed
publication of codes. I think that code
availability by publication, the creation of a
collection  and  dissemination  center  for  machine 
readable  code,  is  a  valuable  contribution  in  many 
cases.    But  despite  the  artistry  of  Bill  Orchard- 
Hayes's  coding,  I've  never  gotten  much  benefit 
from  having  a  listing  of  it  in  front  of  me. 

R. O'NEILL: You know, if you chose the test problems
correctly, you could actually design an experiment
to test one person's artistry in coding against
another's. You could in fact make inferences about
artistry.

UNKNOWN: This sort of thing is being done and results
have been reported in a book called The
Psychology  of  Computer  Programming.    The  experiments 
were  designed  to  test  whether  a  code  produced  by 
different  people  varied  according  to  the  stated 
goals.    It  made  a  tremendous  difference  whether 
the  stated  goals  were  to  produce  code  as  quickly 
as  possible  or  whether  the  code  was  meant  to  be 
as efficient as possible. In any event, this kind
of experiment is being done and the results are
very  interesting. 

E.  HELLERMAN:    I'd  like  to  point  out  something  here 
about  characteristics  of  computer  programming.  I 
actually  recall  an  instance  where  one  of  the  oil 
companies  was  approached  by  Bonner  and  Moore  and 
informed  that  their  problems  could  be  running  in 
one  third  the  time  that  they're  currently  taking. 
Bonner  came  to  our  installation  to  run  it  and 
spent  a  lot  of  time  fiddling  around  with  the  formu- 
lation of  the  problem  until  he  finally  met  his 
objective  of  solving  the  problem  in  one  third  the 
time  that  it  was  currently  taking  for  that  oil 
company.    But  the  reason  that  he  was  successful 
is  that  he  knew  enough  about  the  characteristics 
of  his  code  and  what  is  required  in  the  way  of  the 
characteristics  of  the  formulation,  in  order  to 
take  maximum  advantage  of  his  code  characteristics. 
So  these  are  a  few  other  things  one  runs  up  against 
when  trying  to  measure  code  performance. 

M. GUTTERMAN: Another question to be answered,
however, is what would have happened if they had
taken  the  new  formulation  of  the  problem  and  put 
it  on  LP90  -  would  it  have  run  faster  in  that 
case? 

E.  HELLERMAN:    Another  important  point,  though,  is 
that  Bonner  knew  enough  about  the  oil  company's 
problem  to  say  that  if  you  can  live  with  an  error 
larger  than  was  being  allowed  by  LP90,  you  can  get 
the  kind  of  reduction  in  time  that  he  was  talking 
about.    And  he  indeed  came  up  with  a  solution  that 
had  an  error  at  the  maximum  rate,  a  rate  that  would 
never have been tolerated by LP90. The question
then  becomes  what  can  you  live  with?    How  large 
an  error  are  you  willing  to  tolerate? 

R.  JACKSON:    That  question  of  what  you  can  live 
with is a difficult one and makes me think of the
comment  Chuck  Mylander  made  earlier  about  wanting 
to  keep  only  a  small  selection  of  codes.  There 
are  people  that  I've  done  work  for  who  want  only 
one  nonlinear  programming  code.    One  man  doesn't 
care  whether  there's  a  few  seconds  difference 
between  the  code  that  he's  got  and  the  other  codes 
that  he  might  be  able  to  use.    He  wants  only  to 
learn  how  to  operate  and  become  familiar  with  one 
code  rather  than  have  to  deal  with  a  large  number 
of  them. 

C. MYLANDER: As a referee, I still get a lot of
papers  in  which  an  algorithm  is  proposed  but  the 
paper  doesn't  have  any  other  merit  than  the  pro- 
posed algorithm  and  it  hasn't  even  been  coded. 

H.  GREENBERG:    Why  don't  you  reject  it? 

C.  MYLANDER:     I  do.     It's  just  that  I  think  it 
would  be  a  good  idea  to  propose  a  standard  for 
people  who  publish  algorithms  and  computational 
results  making  it  mandatory  that  they  have  run 
the  code,  specified  some  test  problems,  and 
listed  the  environment  in  which  it  ran. 

H.  GREENBERG:    The  evolution  I  see,  for  example 
in  nonlinear  programming,  is  that  fifteen  or 
twenty years ago we were pretty hungry for algorithms,
making it possible for an algorithm to
be proposed with no more justification other than
that it was a clever new idea. As time went on,
however, it became necessary to provide empirical
evidence,  or  theoretical  evidence  like  rate  analy- 
sis, about  the  efficiency  of  a  clever  new  idea. 
At  that  time,  say  in  the  sixties,  it  was  acceptable 
to  do  an  experiment  consisting  completely  of  a 
randomly  generated  problem.    In  the  seventies, 
however,  we're  discovering  that  is  no  longer  accep- 
table and  there  appears  to  be  a  move  toward  much 
more  sophisticated  design  of  experiments  to  test 
whether  an  assertion  of  a  superior  algorithm  is 
true. 

H. CROWDER: What's the possibility of sending
programs and listings to referees that can be
used in refereeing the paper? For example, I
once was asked to referee a paper for TOMS. The
editor, Milt Gutterman, sent me a deck, and a
listing,  and  the  third  example  I  tried  on  the  code 
failed.    I  simply  packed  it  all  up  and  sent  it 
back  to  the  editor  and  it's  back  in  the  hands  of 
the  author  now. 

R.  DEMBO:    When  I  originally  brought  the  topic 
up,  I  was  thinking  about  the  kind  of  article  that's 
published  in  which  the  authors  say  "Here  is  a  new 
algorithm  that  took  two  seconds  to  solve  Rosen- 
brock's  function  and  therefore  this  is  a  really 
neat new algorithm." That kind of article is still
being  published,  for  example,  in  Mathematical 
Programming.    It  shouldn't  be.    We  are  now  at  the 
point  where  we  need  much  more  information  than 
that  it  took  two  seconds  to  solve  Rosenbrock's 
Function. 

UNKNOWN: One thing I haven't heard mentioned
today  is  the  situation  where  an  algorithm  is  pro- 
posed to  solve  a  previously  unsolved  problem  or 
for  some  other  reason  there  are  no  other  algo- 
rithms to  compare  against  it.    My  question  is 
should  the  proposer  of  the  algorithms  be  required 
to  program  it?    There  may  be  a  number  of  math- 
ematicians who  simply  are  unwilling  to  do  that. 

H.  GREENBERG:    As  I  understand  the  spirit  of  the 
guidelines,  they're  based  on  the  assumption  that 
they  will  be  applied  only  for  papers  whose  primary 
contribution  is  based  on  a  claim  of  superiority 
or  some  other  way  related  to  competition  with 
other  algorithms  for  solving  problems.    Only  in 
that  circumstance  are  the  guidelines  applied  and 
the  authors  are  asked  to  satisfy  certain  experi- 
mental design  criteria.    Not  much  thought  has 
been  given  to  the  situation  you're  describing. 

R.  O'NEILL:    Before  the  session  ends,  I  would  like 
to  ask  the  audience  a  few  questions  about  some  of 
the  topics  we've  discussed  today.    I'd  like  to 
know,  for  example,  how  many  people  here  think  that 
the  computer  hardware  should  be  mentioned  or  spe- 
cifically included  in  a  paper  to  be  published 
about  computational  results?    What  about  the 
operating  system?    The  compiler?    The  specific 
test  problems  used?    Should  the  test  problems  be 
made  available?    I  think  we  have  a  consensus. 

H.  GREENBERG:    One  more  question.    How  many  care 
whether  the  publications  carry  that  kind  of 
information?

R. DEMBO: It appears, gentlemen, that we have
won.

R. O'NEILL: One more question - how about the
code?    How  many  people  think  the  code  should  be 
made  available?    No  consensus.    What  about  avail- 
able, but  not  published?    No  consensus.  What 
about  available  to  the  referees  only,  for  evalua- 
tion?   A  consensus. 

M.  GUTTERMAN:    That's  a  suitable  compromise. 

R.  O'NEILL:    What  about  proprietary  codes?  Should 
they  be  made  available  to  the  referees  only  for 
evaluation?    No  consensus  on  that  topic. 

R. DEMBO: We've asked a lot of questions here and
got  some  very  good  input  from  the  audience.  The 
next  question  we  have  is  what  exactly  should  we  do 
with  it?    How  do  we  go  about  getting  these  ideas 
implemented? 

J.  FILLIBEN:    Isn't  that  a  problem  for  the  editor- 
ial boards? 

H.  GREENBERG:    Yes.    As  I  understand  it  we  should 
next send a letter to Michel Balinski, the editor
of  Mathematical  Programming,  with  copies  to  editors 
of  other  journals  of  our  field.    Since  we're  a 
Committee  of  the  Mathematical  Programming  Society, 
and  since  Balinski  has  already  indicated  a  great 
desire  to  have  us  produce  these  guidelines,  we  can 
expect  favorable  treatment. 

R.  JACKSON:    It's  time  now  to  sum  up  this  session, 
since  we  have  run  out  of  time.    I  would  like  to 
point  out  that  the  next  step  for  the  Committee  on 
Algorithms  is  to  draft  a  set  of  proposed  guidelines. 
We  plan  a  panel  discussion  at  the  San  Francisco 
meeting  of  ORSA  in  the  spring  of  next  year  where 
these  guidelines  will  be  discussed  by  seven  asso- 
ciate editors  of  the  journals  of  our  field.  After 
that,  a  final  version  of  the  guidelines  will  be 
sent  to  the  editors  of  those  journals.    The  commit- 
tee also  is  organizing  a  research  exchange  or  news- 
letter to  keep  informed  those  persons  who  are 
interested  in  the  work  of  the  committee.    We  will 
be  compiling  a  mailing  list  of  "friends  of  the 
committee",  and  if  anyone  is  interested  please 
send  your  name  and  address  to  any  member  of  the 
committee. With that, we can end the session by
saying thank you all for coming.

The modeling and solution of large-scale mathematical programming
systems have been heavily influenced by the advent of large
and  high  speed  digital  computers,   powerful  commercial  software  sys- 
tems for  mathematical  programming,   and  special  languages  for  matrix 
generation  and  report  writing,   and  finally,   by  the  increasing  com- 
plexity of  decision  making  in  the  business  world.     An  attempt  is 
made here to show how modeling and solution for large-scale business
applications is approached nowadays from a total system viewpoint
starting from the problem definition and including the design
of  input  and  output  systems,   the  formulation  of  the  mathematical 
programming  model,   and  the  generation  of  financial  and  other  busi- 
ness reports  by  drawing  the  information  from  the  optimal  solution 
and  sensitivity  analyses.     A  large  linear  programming  application 
in use for production and distribution planning will be used for
illustration.

ABSTRACT

The  concept  of  search  enumeration  in  integer 
programming  is  applied  to  find  the  optimal 
schedule  in  a  plastic  injection  industrial 
process.     The  production  process  involves 
molding  parts  using  a  molding  machine  which 
usually  has  several  cavities.     Each  cavity 
accommodates a die which makes a single part.
Given a set of customer orders which require
certain production days (known), the problem
is  to  find  a  schedule  which  minimizes  the  num- 
ber  of  molding  machine  setups,  while  satisfy- 
ing all  technological  and  logistical  con- 
straints.    This  paper  presents  a  search 
enumeration  algorithm  to  find  an  optimal 
schedule.     The  algorithm  is  followed  by  an 
illustrative  example.     Some  computational 
results  are  mentioned,  and  other  comments 
concerning  algorithm  improvements  are  also 
given. 

All  work  discussed  here  relates  to  part  of  an 
actual  case  study  for  a  medium  size  Northeast 
Ohio  Corporation.     The  entire  case  study 
resulted  in  the  development  of  a  computer 
system  for  production  scheduling  as  well  as 
for  due  date  assignment,  inventory  control, 
machine  allocation,  and  extensive  data 
processing.     In  this  article  we  concentrate 
our  efforts  on  the  optimization  phase  and 
its  implementation  for  the  production 
scheduling  part  of  the  system. 


*No  part  of  this  document  may  be  reproduced  with- 
out explicit  written  permission  of  both  authors. 


1.  INTRODUCTION 

The  concept  of  search  enumeration  in  integer 
programming  is  applied  to  find  the  optimal 
schedule  in  a  plastic  injection  industrial 
process.     The  production  process  involves 
molding  parts  using  a  molding  machine  which 
usually  has  several  cavities.     Each  cavity 
accommodates a die which makes a single part.
Given a set of customer orders which require
certain production days (known),
the  problem  is  to  find  a  schedule  which 
minimizes  the  number  of  molding  machine 
setups,  while  satisfying  all  technological 
and  logistical  constraints.     This  paper 
presents  a  search  enumeration  algorithm  to 
find  an  optimal  schedule.     The  algorithm 
is  followed  by  an  illustrative  example. 
Some  computational  results  are  mentioned, 
and  other  comments  concerning  algorithm 
improvements  are  also  given. 

All  work  discussed  here  relates  to  part 
of  an  actual  case  study  for  a  medium 
size  Northeast  Ohio  Corporation.  The 
entire  case  study  resulted  in  the  develop- 
ment of  a  computer  system  for  production 
scheduling  as  well  as  for  due  date  assign- 
ment, inventory  control,  machine  allocation, 
and  extensive  data  processing.     In  this 
article  we  concentrate  our  efforts  on  the 
optimization  phase  and  its  implementation 
for  the  production  scheduling  part  of  the 
system. 

2. A BRIEF DESCRIPTION OF THE PRODUCTION PROCESS

In  this  section,  we  describe  the  production 
process  which  involves  molding  plastic  parts 
via  a  molding  machine.     A  molding  machine 
accommodates several (usually 6 or 8) dies,
each  of  which  corresponds  to  a  specified 
part.     Dies  in  a  machine  may  be  all  different, 
but  certain  technological  restrictions  exist. 
Most  importantly,  these  are  die  position 
constraints,  and  die  length  constraints.  That 
is,  due  to  their  geometric  characteristics 
and  ease  of  removal  from  the  die,  certain 
parts  must  be  produced  by  dies  located  in  a 
mold  position  near  the  machine  operator.  These 
dies are said to require a "front position"
in the mold, and are labelled "front runners."
In  contrast,  the  remaining  dies  can  have  ei- 
ther a  front  or  rear  position,  where  a  rear 
position  is  at  a  farther  distance  from  the  oper- 
ator.    In  addition  to  front  or  rear  die  posi- 
tions, quality  control  dictates  that  some 
dies  must  be  located  closer  to  the  center  of 
the  mold.     Also,  to  avoid  part  breakage 
during  their  removal  from  the  machine,  the 
difference  of  lengths  of  adjacent  dies  can 
be  no  more  than  3/4". 
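
As a rough illustration of the two restrictions just described, the check below tests whether a die may occupy a given cavity. It is a sketch under stated assumptions rather than part of the system discussed in this paper: the names `Die` and `can_place`, the choice of positions 1 through 4 as the front positions, and the idea of passing in the lengths of the neighboring dies are all hypothetical. The center-of-mold requirement is omitted. The sample die lengths are borrowed from the Illustrative Example.

```python
from dataclasses import dataclass
from typing import Iterable, Set

MAX_LENGTH_GAP_IN = 0.75  # the 3/4" limit on adjacent die lengths from the text


@dataclass(frozen=True)
class Die:
    order: int          # order number the die serves
    length_in: float    # die length in inches
    front_runner: bool  # True if the part must be molded in a front position


def can_place(die: Die, position: int, front_positions: Set[int],
              adjacent_lengths: Iterable[float]) -> bool:
    """Return True if `die` may occupy `position` (hypothetical helper)."""
    # Die position constraint: front runners need a front cavity.
    if die.front_runner and position not in front_positions:
        return False
    # Die length constraint: small tolerance guards against float noise.
    return all(abs(die.length_in - other) <= MAX_LENGTH_GAP_IN + 1e-9
               for other in adjacent_lengths)


FRONT = {1, 2, 3, 4}  # assumed layout, not taken from the paper

print(can_place(Die(1, 3.6, True), 5, FRONT, []))           # False: front runner in a rear cavity
print(can_place(Die(2, 2.5, False), 5, FRONT, [3.6]))       # False: 1.1" gap to a neighbor
print(can_place(Die(4, 2.8, False), 5, FRONT, [3.0, 2.5]))  # True: gaps of 0.2" and 0.3"
```

Keeping the neighbor lengths as an argument avoids committing to a particular mold geometry, which the text does not specify.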

It  is  undesirable  and  costly  to  stop  the 
molding  machine  during  a  production  run. 
Thus  the  molding  operation  is  basically  a 
continuous  production  process.     New  jobs 
to  be  processed  correspond  to  orders  for 
distinct  parts.     Based  on  an  order  quantity, 
the  number  of  production  days  required  is 
obtained.     At  the  end  of  production  for  a 
certain  job,  a  die  must  be  pulled  out  of  a 
cavity  in  the  molding  machine  and  a  new  die 
required  for  the  next  job  is  inserted. 
During  this  setup,  the  whole  molding  machine 
must  be  stopped,  which  means  that  production 
is  lost.     Although  a  setup  can  be  completed 
fairly  quickly,  say  in  30  minutes,  it  could 
take  several  hours  until  production  stabilizes 
and  acceptable  parts  are  molded.  Therefore, 
in  order  to  maximize  production,  it  is  desired 
to  minimize  the  number  of  setups. 

Even  though  our  actual  problem  involves  mul- 
tiple plants  and  multiple  machines,  as  well 
as  some  other  constraints  (such  as  mold  speed, 
plant  specification,  etc.),  in  order  to 
simplify  the  discussion,  we  concentrate  on 
a  one-machine  example.     We  also  assume  that 
orders  correspond  to  distinct  parts,  only  one 
die  is  available  for  each  order  and  the  sched- 
uling period  is  known.     (It  is  actually  de- 
termined from  a  previous  model  in  the  sys- 
tem.)    Our  goal  is  to  find  a  schedule,  which 
minimizes  the  number  of  setups,  while  satis- 
fying all  technological  constraints.  The 
next  section  describes  a  search  enumeration 
algorithm  which  finds  an  optimal  schedule. 

An Illustrative Example (1 machine, 11 orders)

Order     Production   Die         Die Length
Number    Days         Position    (inch)

  1         2          Front       3.6
  2         5          Rear        2.5
  3         3          Front       3.2
  4         7          Rear        2.8
  5         3          Front       3.6
  6         1          Front       3.3
  7         3          Front       2.9
  8         1          Rear        2.8
  9         3          Front       3.0
 10         4          Rear        3.0
 11         4          Rear        2.7

The initial conditions of the machine schedule
(usually due to the previous schedule) and
an arbitrary schedule which satisfies the
technological constraints are shown in Figure 1.
In the figure, the shaded area indicates the
initial conditions and the numbers in the
shaded blocks are the die lengths. The
scheduling period is assumed to be 5 days,
and  the  search  algorithm  discussed  in  the  next 
section  seeks  an  optimal  schedule  which  fills 
up  this  scheduling  period  with  the  smallest 
number  of  additional  setups. 


FIGURE 1. Initial Conditions (shaded area) and
an Arbitrary Schedule for the
Illustrative Example
[Chart not reproduced. Columns: die positions 1 through 8, grouped into front and rear. Legend: * front runner; a second marker flags additional setups.]

3.     FORMULATION  OF  THE  PROBLEM 

The  algorithm  is  similar  to  a  branch  and  bound 
approach  used  in  scheduling  theory  (e.g.,  see 
Baker  [1]),  but  the  nature  of  the  problem,  as 
explained shortly, suggests a search enumeration
used in integer programming (e.g., see
Salkin [2]).* As in any enumeration algorithm,


*A search enumeration, as opposed to a branch
and bound enumeration, was used for the
following reasons:

1) A branch and bound scheme tends to create
many dangling nodes, that is, nodes that
have not yet been considered for branching
purposes. This may cause computer storage
difficulties, whereas a search enumeration
deals with only one node at a
time, at the cost of more bookkeeping, and
does not require the storage of a set of
dangling nodes (see Salkin [2]).

Saving computer storage was especially
important to us because the system is
being implemented on a small computer
(64K bytes of main core).

2)  A  clever  bookkeeping  scheme,  based  on 
the  fact  that  a  forward  step  corresponds 
to  locating  a  die  in  a  position  with  an 
earliest  calendar  opening,  improves  the 
performance  of  a  search  enumeration  and 
uses  a  very  minimal  amount  of  storage. 

3) A search enumeration always gives a
current best feasible schedule and
usually produces a near optimal schedule
very quickly (see Salkin [2]).

the process involves branching and bounding,
and can be pictorially represented by a tree
consisting  of  nodes  and  branches.     The  nodes 
correspond  to  subproblems  and  the  branches 
link  subproblems  which  differ  by  a  single 
additional  job  fixed  in  a  die  position.  We 
now discuss the procedure in the context of an
enumeration  tree. 

Let $P^0$, the initial node in the tree (see
Figure 2), denote the problem containing n
jobs. The problem $P^0$ can be partitioned
into n subproblems, $P^1_1, P^1_2, \ldots, P^1_n$, by
assigning jobs to the first opening (i.e.,
subsequent to forward steps). By "first
opening" we mean the die position which has
the earliest calendar opening. In case of
ties, we can use any consistent rule such
as the smallest index. Thus, $P^1_1$ is the
original problem with job 1 fixed at the
earliest opening; $P^1_2$ is the original problem
with job 2 fixed at the earliest opening,
etc. In the Illustrative Example, the earliest
opening is day 1 at positions 1 and 5.
$P^1_1$ is then the original problem with job
1 assigned at position 1. (We are using the
smallest index rule in case of ties.) Next,
each of the subproblems can be further
partitioned as in Figure 2. For instance, $P^1_2$
can be partitioned into $P^2_{21}, P^2_{23}, \ldots, P^2_{2n}$. For
example, in $P^2_{21}$, jobs 2 and 1 are assigned
at  the  first  two  earliest  openings  in  this 
order.     In  general,  at  level  k,  each  sub- 
problem  contains  k  jobs  already  scheduled 
(or  fixed  in  a  die  position).     Each  sub- 
problem  can  be  further  partitioned  into  "at 
most"  (n-k)  subproblems,  which  form  part  of 
the  level  (k+1)  subproblems.     The  reason 
why  we  say  "at  most"  is  because  of  the  tech- 
nological constraints  mentioned  earlier, 
which  may  limit  the  production  of  some  parts 
at  certain  die  positions.     For  example, 

problem $P^2_{21}$ does not exist in the Illustrative
Example as job 1, which is a front runner,
cannot be placed in die position 5 which is
a rear die position. This type of implicit
enumeration  may  also  be  the  result  of  the  die 
length  constraint. 
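
The partitioning rule can be pictured with a short sketch. The function name `child_subproblems` and the `allowed` callback are hypothetical; the callback stands in for the die position and die length tests, whose details depend on the mold layout. A node at level k is represented by the tuple of the k jobs already fixed, and its children are the tuples obtained by appending one more admissible job, so there are at most (n-k) of them.

```python
from typing import Callable, Iterator, Tuple

Sequence = Tuple[int, ...]  # jobs fixed so far, in order of earliest opening


def child_subproblems(fixed: Sequence, n: int,
                      allowed: Callable[[Sequence, int], bool]) -> Iterator[Sequence]:
    """Yield the level-(k+1) subproblems below the level-k node `fixed`."""
    for job in range(1, n + 1):
        # A job already fixed cannot be scheduled again; `allowed` encodes
        # the technological constraints (an assumption of this sketch).
        if job not in fixed and allowed(fixed, job):
            yield fixed + (job,)


# With no constraints, a level-1 node in a 4-job problem has n-k = 3 children.
print(list(child_subproblems((2,), 4, lambda fixed, job: True)))
# → [(2, 1), (2, 3), (2, 4)]

# If job 1 is not admissible at the next opening, that child is never created.
print(list(child_subproblems((2,), 4, lambda fixed, job: job != 1)))
# → [(2, 3), (2, 4)]
```

The second call mirrors the case in which the subproblem with jobs 2 and 1 at the first two openings does not exist.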

We  go  down  the  tree  until  a  schedule  is 
completed  and  a  solution  is  found.     Then  a 
backward step (a return to the previous subproblem)
is taken and a regular search procedure
starts. The details of a search
enumeration can be found in Salkin [2]. A
part  of  a  search  tree  for  the  Illustrative 
Example  is  given  in  the  next  section  (Figure 
3). 


FIGURE 2. The General Enumeration Tree
[Tree diagram not reproduced; it shows levels 0, 1, and 2.]


4.     THE  ALGORITHM 

Notation    We denote by S a partial sequence
of jobs from among the n jobs originally
in the problem. For example, S =
(2,4,3,1) means that job 2 is scheduled at
the earliest opening, job 4 at the 2nd
earliest opening, job 3 at the 3rd earliest
opening, and job 1 at the 4th earliest opening.
Also, S7 = (2,4,3,1)7 means S = (2,4,3,1,7),
or that job 7 is scheduled at the 5th earliest
opening.

Algorithm  Listing 

Step  1  (Initialization) 

Set Z*, the current smallest number of setups,
to an arbitrarily large value. Set k=0.
The problem is $P^0$ and S is null. Go to Step 2.

Step  2 

Find the (k+1)st earliest opening die position.
In case of ties, select the smallest
index. Go to Step 3.

Step 3 (Forward Step)

At the (k+1)st earliest opening die position,
try to schedule any one, say order i, of the
"untested" orders (i.e., exclude orders already
tested or scheduled) which satisfy the
die position (i.e., front/center/rear)
and die length constraints.
This defines problem $P^{k+1}_{Si}$.

(A) If there is no such order, go to Step 4
(Backward Step). Otherwise, count the total
number of setups thus far, denoted as Z.

(B) If Z ≥ Z*, mark the order as "tested"
(at the current earliest opening die
position), and repeat the Forward Step
with another "untested" order.

(C) If Z < Z* and the schedule is completed,
set Z*=Z (improved schedule found).
Set k=k+1 and S=Si. Go to Step 4
(Backward Step).

(D) If Z < Z* and the schedule is not yet
completed (Forward Step), set k=k+1
and S=Si. Go to Step 2.

FIGURE 3. Part of the Search Enumeration Tree for the Illustrative Example

In  any  case,  an  order  is  never  scheduled  if 
its  production  duration  exceeds  the  time  the 
particular  molding  machine  is  expected  to 
remain  operable.     (This  is  determined  from 
a  different  model,  not  discussed  in  this  ar- 
ticle.) 

Step  4  (Backward  Step) 

Back up to the last scheduled order, and mark
the order scheduled at the die position being
opened as "tested." In other words, if
S=S'i (i.e., the last scheduled order is
order i), remove order i from the current
schedule and return to problem $P^{k-1}_{S'}$. Set
k=k-1, and go to Step 5.

Step  5  (Termination  Test) 

If k < 0, stop; Z* is the minimum number of
setups. Otherwise, count the number of setups,
Z, for the current schedule (i.e., S').
If Z ≥ Z*, go to Step 4 and take another
Backward Step. If Z < Z* (more precisely,
if Z = Z*-1), go to Step 3.
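
The listing above can be read as a depth-first implicit enumeration. The following is a minimal Python sketch of that reading, not the authors' FORTRAN code. The names search_enumeration, feasible, count_setups, DIE_TYPE and toy_setups are hypothetical, and the toy setup rule (one setup whenever consecutive orders need a different die type) is only a stand-in for the die position, die length, and machine-life constraints of the paper. The sketch also assumes that the setup count never decreases when an order is appended, which is what makes the cut in Step 3(B) safe.

```python
from typing import Callable, Optional, Sequence, Tuple

Schedule = Tuple[int, ...]


def search_enumeration(
    orders: Sequence[int],
    feasible: Callable[[Schedule, int], bool],
    count_setups: Callable[[Schedule], int],
    initial_best: Optional[int] = None,
) -> Tuple[Optional[Schedule], float]:
    """Depth-first implicit enumeration mirroring Steps 1-5 (a sketch).

    Assumes count_setups never decreases when an order is appended.
    """
    # Step 1: Z* is arbitrarily large unless a known schedule supplies a bound.
    best_z: float = float("inf") if initial_best is None else initial_best
    best_s: Optional[Schedule] = None

    def forward(s: Schedule) -> None:
        nonlocal best_z, best_s
        # Steps 2-3: the next opening is implied by len(s);
        # try each order that is neither scheduled nor infeasible.
        for i in orders:
            if i in s or not feasible(s, i):
                continue
            si = s + (i,)
            z = count_setups(si)
            if z >= best_z:
                continue  # Step 3(B): cannot improve on Z*, treat as tested
            if len(si) == len(orders):
                best_z, best_s = z, si  # Step 3(C): improved schedule found
            else:
                forward(si)  # Step 3(D): go one level deeper
        # Leaving the loop is the Backward Step (Step 4); returning from
        # the outermost call corresponds to k < 0 in Step 5.

    forward(())
    return best_s, best_z


# Toy stand-in data (hypothetical): each order needs one die type, and a
# setup is counted whenever consecutive orders need different types.
DIE_TYPE = {1: "A", 2: "B", 3: "A", 4: "B"}


def toy_setups(s: Schedule) -> int:
    return sum(DIE_TYPE[a] != DIE_TYPE[b] for a, b in zip(s, s[1:]))


schedule, z_star = search_enumeration(
    orders=[1, 2, 3, 4],
    feasible=lambda s, i: True,
    count_setups=toy_setups,
)
print(schedule, z_star)
```

On this toy data the search ends with a schedule that needs a single setup. Passing initial_best plays the role of the arbitrary starting schedule used in the Illustrative Example, where Z* is initialized to 6 rather than to a large value.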


5.     AN  ILLUSTRATIVE  EXAMPLE 

We now describe how the algorithm works
using the Illustrative Example, and assume
that initially an arbitrary schedule as shown
in Figure 1 is given. This schedule is equivalent
to P^{11}_S, where S = (1,2,3,4,5,6,7,8,9,10,11),
with six (6) additional setups.
(Note that setups due to the previous schedule
are not counted.) Therefore, Z*=6, and the
steps below follow. The tree corresponding
to part of the computation is in Figure 3.

Step  4   (Backward  Step) 

Remove  order  11  from  the  current  schedule. 
Mark order 11 scheduled at the 11th earliest
opening die position (i.e., die position 7
at day 4) as "tested." Set k=11-1=10.

Step  5  (Termination  Test) 

k ≥ 0 and Z=6=Z*=6. Therefore, go to Step
4 and take one more Backward Step.

Step  4  (Backward  Step) 

Remove  order  10  from  the  current  schedule. 
Mark order 10 scheduled at the 10th earliest
opening die position (i.e., die position 4
at day 4) as "tested." Set k=10-1=9.

Step  5  (Termination  Test) 

k ≥ 0 and Z=6=Z*=6.

Step  4  (Backward  Step) 

Remove  order  9  from  the  current  schedule. 
Mark order 9 scheduled at the 9th earliest
opening die position (i.e., position 2 at
day 4) as "tested." Set k=9-1=8.

Step  5  (Termination  Test) 

k ≥ 0 and Z=5≠Z*=6.


Step  3(D)    (Forward  Step) 

Schedule order 10 at the 9th earliest opening
die position. Z=5 < Z*=6, and k=8+1=9, and
S = (1,2,3,...,8,10).

Step  2 

The (k+1)st = 10th earliest opening die position
is die position 4 at day 4.

Step  3(D)   (Forward  Step) 

Schedule order 11 at the 10th earliest opening
position. Z=5 < Z*=6, and k=9+1=10, and
S = (1,2,3,...,8,10,11).

Step  2 

The  11th  earliest  opening  die  position  is 
die  position  7  at  day  4. 

Step  3(B)    (Forward  Step) 

Order 9, a front runner, cannot be scheduled
at the rear, and there is no other alternative.

Step  4   (Backward  Step) 

Remove  order  11  from  the  current  schedule. 
Mark order 11 scheduled at the 10th earliest
opening as "tested." Set k=10-1=9 and
S = (1,2,3,...,8,10).

Step  5  (Termination  Test) 

k ≥ 0 and Z=5=Z*-1.

Step  3(B)   (Forward  Step) 

Order 9 scheduled at the 10th earliest opening
will make Z=6=Z*, and there is no other
"untested" order at this opening.

Step  4  (Backward  Step) 

Remove  order  10  from  the  current  schedule. 
Mark order 10 scheduled at the 9th earliest
opening as "tested." Set k=9-1=8 and
S = (1,2,3,...,8).

Step  5  (Termination  Test) 

k ≥ 0 and Z=5=Z*-1.

Then similar steps repeat by scheduling order
11 at the 9th earliest opening without producing
a better schedule, and eventually the
enumeration reverts to level 8, level 7,
etc. A part of the enumeration tree corresponding
to the above description is given in
Figure 3. After we check all alternatives
at the 1st opening die position, we have
the optimal solution, which is shown in
Figure 4. Notice that Z*=3.


[Figure 4 not reproduced: optimal schedule, with die positions labeled Front and Rear.]

6. COMPUTATIONAL EXPERIENCE

The  algorithm  has  been  coded  in  FORTRAN 
and  is  being  implemented  on  a  small  IBM 
computer  with  64K  bytes  of  main  core. 
A  few  smaller  problems  have  thus  far  been 
tested  successfully. 

Additional  computational  behavior  corre- 
sponding to  the  Illustrative  Example  (Section 
5)  is  in  Table  1.     Running  times  relate  to 
test  runs  on  a  UNIVAC  1108  computer. 

TABLE 1

Computational Performance for Example 1

                 Time to reach level k
                 first time after a        Current smallest
Level k          Backward Step             number of setups
                 (seconds/Univac 1108)

11                     0.000                      6
10                     0.001                      6
 9                     0.005                      6
 8                     0.011                      6
 7                     0.015                      6
 6                     0.030                      5
 5                     0.034                      5
 4                     0.060                      5
 3                     0.147                      4
 2                     1.290                      4
 1                     1.509                      3*
 0 (Optimality)        5.984                      3

*The optimal solution with 3 setups is obtained
after 1.404 seconds.


The algorithm can easily be extended to the
multiple plant and multiple molding machine case.
The actual case study involves many molding
machines located at several plants. Each
molding machine corresponds to a particular
mold type and to a particular material,* and so
the algorithm may be applied directly to all
molding machines with the same mold/material
type. The multiple plants normally add
certain technological, logistical, and/or
quality control constraints. The resulting
plant specification(s) naturally contribute
to the algorithm's efficiency.

*The material specification is suggested in
a previous model which is not discussed in
this article. The final material specification
is given by management.

There are two different approaches when
extending the algorithm to the multiple
machine and multiple plant case. One approach
lays out all machines in parallel and applies
the algorithm directly. If we consider
a two machine problem, each machine with
eight cavities, the approach reduces to
considering a one machine problem with sixteen
cavities. Of course, the way to count
the number of setups has to be modified,
because a setup in one machine is independent
of a setup in the other. On the other hand,
a second approach to the multiple machine
problem is to lay out all machines in series.

In order to apply this approach, a planning horizon
has to be introduced, where the idea is to
schedule the machines during a given planning
horizon. An order is scheduled only when it can
start during the planning horizon. With this
approach, the algorithm, during forward steps,
schedules the complete planning horizon of the first
machine, and then goes to the second machine, and
so on. The backward step works in exactly the
opposite way. If this approach is adopted, then we
are minimizing the number of setups which result
from schedules that fill up the planning horizon.
Therefore, it is possible that certain orders
cannot be scheduled (or, more precisely, cannot be
started) during the planning horizon in the
optimal schedule (e.g., see Figure 6). At this time,
the program uses the latter approach.

It should also be mentioned that the algorithm
currently does not allow for "a hole" in a
schedule. In other words, in a specific cavity of a
molding machine, two subsequent orders must be
scheduled in such a way that when the first one is
finished, the other has to be started immediately
subsequent to a setup. It is conceivable that by
allowing a gap in a schedule (in practice, a
cavity is blocked in this case) the number of setups
can be reduced.

Computations with a sample problem are given
below. In this example, we have two machines
at two different locations, A and B. Due to
certain logistical and other constraints not
mentioned, it turns out that some orders must be
produced at a specific plant, and thus we have a
plant specification. Also, the two machines,
denoted by X and Y, have slightly different
capabilities, and some orders must be processed by a
particular machine (machine specification). If no
specification is made, any order can be processed
by any machine. Machine X is located at Plant A,
and Machine Y at Plant B. Initial conditions of
the machine schedule and an initial arbitrary
schedule are given in Figure 5. The optimal
schedule for the Example, found by the search
enumeration, is shown in Figure 6. Notice that orders
6 and 8 are not scheduled during the current
planning horizon.

EXAMPLE

(2 plants, 2 machines, 14 orders)


Order   Production   Plant   Machine   Die        Die
No.     Days         Spec.   Spec.     Position   Length

 1         6          A        X       Front       3.0
 2         6          A        X       Rear        2.5
 3         3          -        -       Rear        2.4
 4         1          -        -       Front       3.0
 5         6          -        X       Front       3.0
 6         2          A        -       Front       2.8
 7         2          -        -       Rear        2.6
 8         4          B        -       Front       3.3
 9        19          B        -       Front       3.0
10        10          -        -       Rear        2.3
11         2          -        -       Rear        3.0
12         4          -        -       Rear        2.9
13         7          -        -       Rear        2.8
14         5          -        -       Front       3.5

(-: no specification)

FIGURE 5. Initial Conditions (shaded area) and an Arbitrary Schedule for the Example
[Chart not reproduced. Panels: Machine X (Plant A) and Machine Y (Plant B); axis: Dies; * marks front runners.]

FIGURE 6. Optimal Schedule for the Example
[Chart not reproduced. Panel: Machine X (Plant A); axis: Dies; a marker shows the end of the scheduling period. Orders 6 and 8 not scheduled.]

The model is being tested for problems with 5
plants and about 30 molding machines. Fortunately,
as there are several material types and mold
types, it can be subdivided into smaller problems.

Even though the current model assumes deterministic
production days, in reality, some orders have
a range of production days such as either 7 or 8
days. This allows more flexibility in the
scheduling phase, and may further reduce the number
of setups. The model has been expanded to
incorporate variable production days. (This is
examined at each step in the enumeration and thus
the schedules may not be optimal in a global
sense.)

In general, the algorithm tries at most n!
combinations, where n is the number of orders. However,
in computations, implicit enumeration substantially
reduced the number of combinations that must
be examined. Judging from the fact that a fairly
good schedule is often obtained early in the
enumerative process and also that a substantial
amount of computation is usually spent to show
that the current best solution is, in fact,
optimal, heuristics which will curtail the
computations are now being considered. For example,
we can restart the algorithm by changing the
earlier part of the schedule drastically when
the computational progress slows, and eventually
terminate all computations.
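
To give a feel for the n! bound, the short Python calculation below evaluates it for the 11 orders of the Illustrative Example and for the 15 orders mentioned as the largest case tested so far. The function name max_sequences is an illustrative choice, and the figures are upper bounds on complete sequences, not counts of nodes the algorithm actually visits.

```python
import math


def max_sequences(n_orders: int) -> int:
    """Upper bound on complete sequences of n distinct orders: n!."""
    return math.factorial(n_orders)


for n in (11, 15):
    print(f"n = {n:2d}: n! = {max_sequences(n):,}")
```

This gives 39,916,800 sequences for n = 11 and 1,307,674,368,000 for n = 15, which is why the bounding test on Z* matters even for problems of this modest size.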


7.  CONCLUSION 

The problem of finding an optimal schedule which
minimizes the number of setups in a plastic
molding operation is formulated and solved by
search enumeration. Even though the problems
tested so far are relatively small (at
most 15 orders), the original problem combined
with all technological, logistical, and other
constraints yields several smaller problems.
Each of these, with a clever heuristic rule
that generates a good, if not optimal, schedule
within a reasonable amount of computer time,
should be solved quickly using the algorithm.
The algorithm is currently in an extensive testing
stage for use as a subroutine in a production
scheduling and inventory control computer system.

AN IMS-GAMMA 3
DATABASE EDITOR


E.  B.  Brunner 
Gulf  Oil  Corporation 


INTRODUCTION 

As personnel from all levels of today's
industries become more involved with the
"man-computer dialogue", more people without
computer training are using computer facilities.
For many years now,
the shift has been away from having highly
trained computer people provide the services to
execute a program. Increasingly, the developed
program is turned over to the user, who tabulates
the data and executes the program himself. This,
in turn, has created a need for an easy method for
the user to do this data tabulation and program
execution.

One feature that can help to accomplish
this is the use of a database, defined as
"a collection of interrelated data stored
together with controlled redundancy to serve one
or more applications in an optimal fashion; the
data are stored so that they are independent of
programs which use the data; a common and
controlled approach is used in adding new data and
modifying and retrieving existing data within the
data base."¹

Utilizing  this  concept  with  its  "common  and 
controlled  approach  to  add  and  modify  data"  we 
can  devise  programs  by  which  a  user  can  tabulate 
data  and  execute  his  program  easily. 

Furthermore,  this  man-machine  interface 
should  fulfill  the  following  objectives: 

1.  Meet  the  needs  of  the  user. 

2. Understandability of results.

3.  Reliability  of  execution. 

4.  Ease  of  use. 

5.  Timely  results. 

EVOLUTION  OF  A  PROJECT 

Early in 1973, a request was made by the
Gulf Oil Trading Company for an analytical tool
(i.e., L.P. model) that would determine the
relative value of crude oils for different refinery
situations. The model was developed in card deck
form using the matrix generator - report writer
product of Bonner & Moore Software Systems,
GAMMA 3. The model performed very
satisfactorily and gave clearly understood results.
The familiar complaint then came from the
customer that the mechanics of tabulating the
information on keypunch forms, waiting for and
checking the keypunching and then submitting the
deck was too time consuming and prohibited them
from obtaining the needed results within a
half-day or less. What could we do about this? The
solution to the problem seemed to be some form
of interactive setup to eliminate the need for the
card deck. This requirement turned out to be
two-fold: a method or program to create and maintain
an input database, and then a way of submitting a
job stream. For their purposes we had met
items numbered 1, 2 and 3 of our objectives list;
but as yet, had not addressed items 4 and 5.

Because IBM's Information Management
System (IMS) was already resident at Gulf and
provided the needed database capabilities via a
Cathode Ray Tube or batch program, and because the
project sponsor's analysts were familiar with
using the CRT for financial work, it was decided
to use IMS to meet these requirements.

THE  DATABASE  EDITOR 

The  database  and  its  programs  were  to 
handle  data  in  either  the  table  or  list  format 
utilized  in  GAMMA  3,  and  in  our  design  each  table 
and  list  was  to  be  a  database  record.    Figure  1  is 
an  example  of  a  GAMMA  3  table  and  list  in  card 
image  form.     The  table  in  database  record 
structure  would  appear  as  in  Figure  2. 

Further,  since  it  was  projected  that  many 
models  would  probably  use  this  database,  a  two- 
character  prefix  was  defined  to  be  used  as  a  means 
of  distinguishing  unique  sets  of  tables  and  lists. 
Each  table/list  name  would  be  prefixed  by  these 
two  characters  to  reserve  it  for  a  particular  user. 
Additionally,  password  security  was  to  be  made 
available  for  each  user-chosen  prefix.    The  IMS 
non-conversational program to manipulate these
records was written in PL/I and called GAMED.
Presently, it is running on an IBM 370/158 under
VS and is accessible through a 3270 CRT or
equivalent.

¹Martin, James, "Principles of Data-base
Management," Prentice Hall Inc.

As  displayed  on  the  next  two  figures  (3  and 
4),  GAMED  provides  capabilities  to  display  and 
edit  database  tables  and  lists.    For  display  pur- 
poses the  user  can  show: 

1. A table with its text and/or data.

2.  A  list  or  list  with  text. 

3.  A  table  or  list  row  names. 

4.  Table  column  names. 

5.  A  tabulation  of  tables  and  lists 
in  a  database  set. 

For  editing  the  user  can: 

1. Modify

   A. Table/list text

   B. Table data

   C. Row/column maximum counts

2. Add

   A. Complete tables/lists

   B. Table rows and/or columns

   C. List rows

3. Delete

   A. Complete tables/lists

   B. Table rows and/or columns

   C. List rows

   D. Entire database set

4. Duplicate

   A. Individual tables/lists

   B. Entire database set

5. Replace Row and Column Names

Since  the  structure  of  a  table/list  seems  to 
have  the  two  distinct  parts  of  text  and  data,  displays 
are  made  with  the  table/list  text  formatted  on  a 
screen  in  a  separate  manner  from  table  data. 
(Figures 5 and 6.)

Upon  initialization  of  the  transaction  GAMED, 
the  user  obtains  the  display  in  Figure  7.  As 
explained  in  the  directions,  the  user's  two- 
character  prefix  is  to  be  placed  within  the 
parentheses,  the  table/list  name  that  is  to  be  acted 
upon  replaces  the  word  NAME,  the  field  with  the 
word  TEXT  specifies  the  current  format  in  use, 
the  three  letter  designator  following  the  character 
string  'FUNC:'  is  the  function  field  (SEE),  the 
space between the words SEE and PAGE is the
options field, and the PAGE clause on the right
has indicators for multiple page messages.

If  we  use  the  prefix  (EB),  table  name 
SAMPLE,  function  SEE  (Figure  8)  and  depress  the 
ENTER  key,  the  resultant  display  will  look  like 


Figures 9 and 10. Note that this is a database
display of the GAMMA 3 table listed previously:
Figure 9 is the text portion and Figure 10 is
the data.

If  we  would  like  to  modify  some  data  entry 
on  Figure  10,  the  function  is  changed  to  MOD,  and 
we  move  the  cursor  down  to  the  value  to  be 
modified  and  change  it.    In  this  case,  it  is  the 
value  at  the  intersection  of  row  R2  and  column  C3 
(Figure  11).    Then  depress  the  ENTER  key  and 
receive  the  response  in  Figure  12. 
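
The record-per-table design and the SEE and MOD functions described above can be illustrated with a small in-memory sketch in Python. This is not the GAMED implementation, which was an IMS program: the class names TableRecord and TableStore, the field layout, and the method signatures are all assumptions made for illustration, and password security per prefix is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TableRecord:
    """One GAMMA 3 table held as a single database record (layout assumed)."""
    text: List[str]
    rows: List[str]
    cols: List[str]
    data: Dict[Tuple[str, str], float] = field(default_factory=dict)


class TableStore:
    """Hypothetical stand-in for the database: records keyed by (prefix, name)."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str], TableRecord] = {}

    @staticmethod
    def _key(prefix: str, name: str) -> Tuple[str, str]:
        # The two-character prefix separates one user's tables from another's.
        if len(prefix) != 2:
            raise ValueError("prefix must be exactly two characters")
        return prefix, name

    def add(self, prefix: str, name: str, record: TableRecord) -> None:
        self._records[self._key(prefix, name)] = record

    def see(self, prefix: str, name: str) -> TableRecord:
        """Counterpart of the SEE function: fetch a table for display."""
        return self._records[self._key(prefix, name)]

    def mod(self, prefix: str, name: str, row: str, col: str, value: float) -> None:
        """Counterpart of the MOD function: change one data entry."""
        record = self.see(prefix, name)
        if row not in record.rows or col not in record.cols:
            raise KeyError(f"no such cell: {row}/{col}")
        record.data[(row, col)] = value


store = TableStore()
store.add("EB", "SAMPLE", TableRecord(
    text=["Sample table"], rows=["R1", "R2"], cols=["C1", "C2", "C3"]))
store.mod("EB", "SAMPLE", "R2", "C3", 42.0)  # the R2/C3 edit from the text
print(store.see("EB", "SAMPLE").data)
```

Keying every record by prefix and name is one simple way to obtain the separation between users that the two-character prefix is meant to provide. The value 42.0 is arbitrary.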

Once  GAMED  was  completed,  there  still 
remained  the  problem  of  having  the  GAMMA  3 
model  generator  access  these  database  tables  and 
lists.     This  was  accomplished  by  two  additional 
programs,  called  GAMLOAD  and  GAMUNLD. 
GAMLOAD  will  take  a  specially  formatted  file  of 
tables  and  lists  created  by  GAMMA  and  load  them 
onto  the  database  under  a  specified  prefix, 
whereas  GAMUNLD  will  unload  all  or  selected 
tables  of  one  or  more  prefixes  and  create  this 
specially  formatted  file  to  be  readily  useable  by 
GAMMA.     In  addition,  another  utility  program 
exists  to  list  and/or  punch  this  file  in  card  image 
form. 

After  these  programs  were  completed,  the 
first  part  of  the  original  two-fold  user  problem 
was  solved;  now  programs  existed  to  maintain  the 
input  database  and  make  it  available  to  the  model. 
This  eliminated  any  need  for  data  cards  since  the 
analyst  running  the  model  could  enter  the  data  via 
the  CRT. 

While the effort to produce GAMED was
taking place, a second IMS program called
JOBEDIT was developed. Its purpose was to
maintain and edit a database containing jobstream
card images. The database, called JOBFILE,
was separated into members designated by an
eight-character name. As an added feature, any
of these members that began with a job card
could be submitted as a job stream with a RUN
command. (Example, Figure 13.) This, of
course, satisfied our second user requirement.

The  customer  was  then  provided  with  a 
member  on  JOBFILE  that  contained  a  job  stream 
to  execute  the  model.     After  the  data  tables  and 
lists  were  updated  for  the  current  problem,  the 
user  would  RUN  his  job  stream  and  get  results 
within  the  desired  time  frame  of  a  half-day  or  less. 

We now have 17 production L.P. models
and three non-L.P. model applications being run
in this manner by users at various geographical
locations  throughout  the  country.     This  type  of 
operation  has  been  so  successful  that  we  now 
receive  requests  from  our  user  companies  that 
stipulate  that  their  models  must  be  developed  to  be 
run  from  an  IMS  terminal. 

Finally,  I  would  like  to  list  some  other  IMS 
capabilities  that  we  have  added  to  our  operations. 
As  part  of  a  run,  the  user  can  optionally: 

1. Receive reports at a CRT and have
these available for recall.

2. Receive reports as hard copy at a
local printer.

3. Load onto the database tables/lists
created during execution.

Figure 14 shows a diagram of the overall
operation from the interaction of GAMED to the
unload, optimize, and reload cycle, thereby giving what
we feel is a complete and unified system.

Abstract

Mathematical programming is increasingly being
combined with other mathematical modeling techniques.
A combination which has proved particularly
fruitful  in  the  field  of  energy  modeling  is  that  of 
an  econometric  model  and  a  linear  program.  Some 
practical  experience  has  shown  that  although  such 
combined  models  give  rise  to  unexpectedly  few 
mathematical  difficulties  during  solution,  the  com- 
puter software  which  is  currently  available  is  very 
inconvenient  for  implementing  these  models.  This 
paper  analyses  the  sources  of  these  difficulties, 
and  describes  some  new  software  tools  which  are 
being  developed  to  aid  in  the  construction  of  com- 
bined modeling  systems. 

1.    Introduction  -  The  State  of  the  Union 

Recently,  mathematical  models  which  have  as 
major  components  both  an  econometric  part  and  a 
linear  programming  (LP)  part  have  come  into  vogue. 
Hogan  [1]  has  described  how  such  a  model  was  con- 
structed and  solved  at  the  Federal  Energy  Adminis- 
tration for  the  Project  Independence  Evaluation 
System.    Jorgenson  [2]  has  proposed  a  combination 
of  the  Brookhaven  Energy  LP  model  and  the  DRI 
interindustry  econometric  model.    Shapiro  [3] 
describes  these  and  similar  models,  and  discusses 
mathematical  approaches  to  their  solution.  Cur- 
rently such  solution  methods  require  iteration 
between  the  econometric  model  and  the  LP;  the 
econometric  model  provides  supply  and  demand  data, 
and  the  LP  returns  an  optimal  distribution  of 
commodities  to  meet  the  demand  and  the  associated 
shadow  prices.    The  process  terminates  when  an 
equilibrium  set  of  prices  is  attained.    This  paper 
addresses  the  question  of  what  software  tools 
are  needed  by  a  person  who  wishes  to  develop  and 
implement  an  algorithm  of  this  nature  with  the 
minimum  computer  programming  effort. 
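
The iteration just described can be sketched in Python with toy stand-ins for both components. Nothing here comes from PIES or the other cited models: the linear demand curve, the three supply tranches, and the names shadow_price and equilibrate are assumptions for illustration. The "LP" is replaced by a least-cost merit-order dispatch whose marginal cost plays the role of the shadow price, and convergence is not guaranteed in general.

```python
from typing import Callable, List, Tuple

Tranche = Tuple[float, float]  # (capacity, unit cost)


def shadow_price(demand: float, tranches: List[Tranche]) -> float:
    """Stand-in for the LP: meet demand at least cost from supply tranches.

    The unit cost of the marginal tranche plays the role of the shadow
    price on the demand constraint.
    """
    served = 0.0
    for capacity, cost in sorted(tranches, key=lambda t: t[1]):
        served += capacity
        if demand <= served:
            return cost
    raise ValueError("demand exceeds total capacity")


def equilibrate(
    demand_at: Callable[[float], float],
    tranches: List[Tranche],
    price: float = 0.0,
    tol: float = 1e-9,
    max_iter: int = 50,
) -> Tuple[float, float, int]:
    """Alternate between the demand model and the LP stand-in until the
    price repeats; returns (price, demand, iterations used)."""
    for iteration in range(1, max_iter + 1):
        demand = demand_at(price)                   # econometric side
        new_price = shadow_price(demand, tranches)  # LP side
        if abs(new_price - price) <= tol:
            return new_price, demand, iteration
        price = new_price
    raise RuntimeError("no equilibrium within max_iter iterations")


price, demand, iterations = equilibrate(
    demand_at=lambda p: 100.0 - 5.0 * p,  # toy demand curve (assumed)
    tranches=[(40.0, 2.0), (40.0, 5.0), (40.0, 9.0)],
)
print(price, demand, iterations)
```

With these toy numbers the price sequence is 0, 9, 5, 5, so the loop stops on the third pass at a price of 5.0 and a demand of 75.0. In a real system each pass would cross the boundary between the econometric package and the LP package, which is where the software difficulties discussed in this paper arise.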

It  will  be  useful  at  the  outset  to  distin- 
guish between  two  separate  but  connected  aspects 
of  the  problem  of  combining  LP  and  econometrics  on 
the  computer.    One  aspect  is  that  of  data  base 
management; that is, the marshalling of the
original raw data into a form suitable for computation,
and the production of meaningful output. The
other  is  that  of  system  integration;  that  is,  the 
provision  of  mechanisms  to  control  the  execution  of 
the  processes,  and  to  facilitate  transfer  of  data 
across  boundaries  between  systems.    This  paper  is 
principally  concerned  with  system  integration;  data 
base  management,  LP  model  generation,  and  report 
writing  form  the  principal  themes  of  companion 
papers  [5],  and  so  these  topics  will  not  be  treated 
in  detail  here. 

Let  us  first  examine  the  problems  that  confront 
a  modeler  who  tries  to  use  existing  software. 
Hogan  [4]  has  described  the  difficulties  that  arose 
during  the  implementation  of  the  PIES  model  [1]. 
Despite  there  being  no  rigorous  theoretical  founda- 
tion to  the  algorithm,  its  convergence  was  found  to 
be  unexpectedly  rapid  (6  to  10  iterations  being 
required). Hogan was fortunate in that the integrating
portion of the model could be readily coded
in  FORTRAN,  and  that  APEX,  the  LP  optimizer  he  used, 
has  the  most  convenient  interface  with  FORTRAN  of 
any  commercially  available  large-scale  LP  code.  He 
has  stated  [4]  that  the  major  difficulty  in  the 
project  was  that  of  matrix  generation,  but  it  would 
probably  be  more  accurate  to  say  that  the  real 
problems  were  in  data  base  management,  as  will  be 
clear  from  a  full  account  of  the  project  [6].  Never- 
theless, although  the  combination  of  the  two  models 
did  not  constitute  the  greatest  difficulty  in  the 
project,  it  is  clear  that  the  integration  process 
was  by  no  means  simple  and  straightforward. 

The  paucity  of  published  results  on  the  success- 
ful combination  of  modeling  techniques  tends  to 
corroborate  the  view  that  implementation  of  such 
combined  models  is  difficult  in  practice.  Apart 
from Hogan's work, we are forced to draw on our own
experience  at  NBER  for  illustration  of  the  diffi- 
culties which  are  encountered.    The  major  stumbling 
block  is  that  the  econometric  system  and  the  LP 
system,  having  grown  up  separately,  do  not  have  an 
easy  way  to  exchange  data;  furthermore,  each  has  its 
own  execution  control  mechanisms  and  neither  will 
allow  itself  to  be  subservient  to  the  other.    As  an 
example,  consider  a  pilot  implementation  of  the 
least  absolute  residuals  algorithm  for  regression, 
in  which  the  data  were  set  up  and  manipulated  in 
TROLL  [7]  and  the  minimization  performed  using  the 
SESAME  LP  system  [8].    In  view  of  the  experimental 
nature  of  the  project,  it  was  decided  to  use  an 
existing LP system rather than to code one from
scratch, even though there are modified versions of
the  simplex  algorithm  known  which  will  solve  this 
particular  problem  more  efficiently  than  a  standard 
simplex  code.    Since  TROLL  and  SESAME  must  each 
reside  in  a  different  virtual  machine  (because  TROLL 
contains  its  own  virtual  machine  supervisor),  the 
only  means  for  passing  data  between  them  was  as 
spooled  card-image  files.    In  addition,  the  user  had 
to  control  the  whole  process  manually  from  the  con- 
sole because  there  was  no  way  to  pass  control 
between  the  two  virtual  machines.    This  latter 
difficulty  could  have  been  overcome,  at  the  cost  of 
considerable  system  programming  effort,  by  arrang- 
ing that  each  virtual  machine  could  suspend  itself 
and  activate  the  other  at  appropriate  times.  In 
the  end,  this  project  was  abandoned  because  the 
effort  involved  seemed  too  great  for  the  expected 
benefits.    TROLL  and  SESAME  were  both  designed  to 
be  flexible  systems  for  experimental  rather  than 
production  use,  and  the  problem  of  combining  their 
facilities  was  not  expected  to  be  so  difficult. 
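
For readers unfamiliar with why an LP code is relevant to regression here, the standard linear programming form of the least absolute residuals problem is sketched below. The symbols are assumptions introduced for illustration: y_i are the m observations, x_{ij} the values of the p regressors, \beta_j the coefficients, and u_i, v_i the positive and negative parts of the i-th residual.

```latex
\begin{aligned}
\min_{\beta,\,u,\,v}\quad & \sum_{i=1}^{m} (u_i + v_i) \\
\text{subject to}\quad & \sum_{j=1}^{p} x_{ij}\,\beta_j + u_i - v_i = y_i,
    \qquad i = 1,\dots,m, \\
& u_i \ge 0,\quad v_i \ge 0, \qquad \beta_j \ \text{free}.
\end{aligned}
```

At an optimal solution at most one of u_i and v_i need be positive for each i, so u_i + v_i equals the absolute residual and the objective is the sum of absolute residuals. In the pilot described above, the data for such a problem would be prepared on the TROLL side and the minimization carried out on the SESAME side.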

Lest  it  be  thought  that  NBER's  systems  are  the 
only  offenders,  consider  the  way  IBM  approached  a 
similar problem, that of combining APL with MPSX [9].
Again,  data  are  communicated  by  means  of  card-image 
files.    So  is  program  control  information  --  the 
APL  program  constructs  a  card  file  of  MPSX  control 
language  commands  and  passes  this  file  to  the  MPSX 
compiler.    This  mechanism  is  both  clumsy  and 
inefficient,  but  is  necessary  because  MPSX  accepts 
input  of  control  information  only  in  the  specific 
form  of  card  images. 

The  APL/MPSX  combination,  though  achieved  by  a 
roundabout  means  and  with  no  little  programming 
effort,  was  successful  enough  to  warrant  publication. 
Naturally,  failed  attempts  to  adapt  an  existing  LP 
system  to  a  new  environment  do  not  reach  the  litera- 
ture, but  the  author  is  aware  from  personal  con- 
tacts with  people  in  the  modeling  world  that  many 
such  attempts  have  been  envisioned.    Because  of  the 
obvious  difficulties  which  would  ensue,  few  such 
attempts  proceed  beyond  the  speculative  stage. 

We  may  conclude,  then,  that  the  state  of  the 
union  is  not  good. 

2.    Limitations  of  Current  Software 

Let  us  summarize  the  main  points  arising  from 
the  general  overview  given  in  Section  1.    The  areas 
in which current software exhibits deficiencies are
as  follows: 

(a)  Data  base  management  and  genera- 
tion of  LP  models  from  raw  data 

(b)  Control  of  LP  procedures  by  other 
programs 

(c)  Transfer  of  data  between  LP  proce- 
dures and  other  programs 

Category  (a)  is  treated  in  companion  papers  [5]  and 
will  not  be  discussed  here.    Categories  (b)  and  (c) 
overlap  slightly  (for  example,  control  information 
might  be  regarded  as  data  passed  to  the  LP  program), 
but  they  are  essentially  separate. 

A major drawback of current large-scale LP-solving
programs (with the partial exception of APEX) is
that they cannot be invoked from another program
written in a standard language such as FORTRAN or
PL/I by a subroutine call or similar mechanism.
APEX allows its constituent routines to be called
from a FORTRAN program once the general APEX
environment has been established. Other LP systems
can be invoked only by means of a special language
peculiar to each system. One reason for this may be
that most operating systems do not provide a
suitable dynamic loading capability; LP codes are
now so voluminous that they are too big to include
in a load module with other programs of comparable
size. However, the main reason appears to be that
LP programs are deemed to require such a
specialized environment for efficient operation
that only the LP system itself can be entrusted
with creating this environment, and that therefore
the LP system must be in overall control of the
whole process. Some progress seems to have been
made in modifying commercial large-scale LP systems
to provide a more convenient interface for the
experimental user; IBM claims that MPSX/370 has the
required flexibility [10]. However, it is clear
that the only way to achieve a really clean control
interface is to design the LP code with that
requirement in mind at the outset, rather than to
try to adapt an existing code. This may entail the
sacrifice of some raw computational efficiency in
the interests of greater flexibility of application.

Transfer of problem specification data into LP
systems is still essentially at the BCD stage.
Beale has championed a "card image" data interface
on the grounds that such a medium provides the
maximum standardization and flexibility [11]. Whilst
it is true that such an interface is desirable to
facilitate transfer of data between computers of
different manufacturers, card images are a poor
means of passing data between programs running on
the same machine. However, Beale's philosophy has
prevailed thus far, and many current matrix
generator programs, such as GAMMA [12] and MaGen
[13], produce as their output a specification of
the LP problem in a card image format, usually that
of MPS/360, which has become a sort of informal
standard. Such a mode of operation does have the
advantage that a single matrix generator can
produce output which is suitable for input to a
variety of different LP systems, but the
intermediate data form was originally designed for
human readability and does not lend itself readily
to manipulation by the computer. Indeed, some of
the complexity and inefficiency of matrix
generators can be attributed to exactly this cause.
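
To make the card-image interface concrete, the following minimal sketch writes a small LP as fixed-column card images in the style of the MPS format. The column positions follow the commonly documented fixed MPS layout and should be read as an approximation rather than a specification of any particular LP system; the function name mps_cards and the example names TINY, COST, DEMAND and X1 are hypothetical.

```python
def mps_cards(name, rows, columns, rhs):
    """Return a list of card images describing a small LP.

    rows:    list of (row_type, row_name), row_type in {"N", "G", "L", "E"}
    columns: list of (column_name, row_name, coefficient)
    rhs:     list of (row_name, value)
    Assumed layout: type in columns 2-3, names in columns 5-12 and 15-22,
    numeric value in columns 25-36.
    """
    cards = [f"NAME          {name}", "ROWS"]
    for row_type, row_name in rows:
        cards.append(f" {row_type:<2} {row_name:<8}")
    cards.append("COLUMNS")
    for col, row, value in columns:
        cards.append(f"    {col:<8}  {row:<8}  {value:>12.6g}")
    cards.append("RHS")
    for row, value in rhs:
        cards.append(f"    {'RHS':<8}  {row:<8}  {value:>12.6g}")
    cards.append("ENDATA")
    return cards


if __name__ == "__main__":
    for card in mps_cards(
        "TINY",
        [("N", "COST"), ("G", "DEMAND")],
        [("X1", "COST", 2.0), ("X1", "DEMAND", 1.0)],
        [("DEMAND", 90.0)],
    ):
        print(card)
```

Every number passes through a character representation on its way out of the generator and must be parsed again on its way into the LP code, which is the overhead the text attributes to this interface.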

Some matrix generators, notably DATAFORM [14]
and DATAMAT [8], produce a specification of the
problem directly on a special file, referred to
variously as the "problem file" or "models file".
This allows the LP to dispense with its INPUT or
CONVERT routine which it would normally use to
convert the card deck into an internal
representation. This representation also
facilitates a direct revision of the model, which
would otherwise have to be performed by altering
the matrix generator program, or its data, or both,
and executing the matrix generator again. The data
on the problem file are, however, accessible to the
user only via special routines which themselves
depend on the environment set up by the LP program.

In a similar way, output of solution information
from typical LP systems is at the BCD level.
In  order  to  be  read  by  another  program,  such  data 
has  to  be  stored  on  a  file  in  a  format  which  is 
compatible  with  the  other  program,  and  unit  records 
of  80  characters  or  so  are  a  common  choice.  Some 
LP  systems  allow  the  option  of  filing  the  solution 
data  in  some  internal  form  on  a  "results  file"  and 
provide  system  routines  by  which  the  user  may 
access the data. This is neater, but has the drawback
that these routines cannot be used outside
of  the  control  scope  of  the  LP  program. 

Current LP programs fail to distinguish
sharply enough between two sorts of data about the
LP model -- that pertaining to the structure of the
model (its size and the pattern of non-zero
elements), and the actual values of the coefficients.
As explained in Section 1, algorithms for solving
combined models typically use the results of one
LP run to revise the input data for the next. In
such revision, only the coefficients are changed
and not the structure of the model. Hence, it is
advantageous to keep these data entities separate,
and to make the coefficient values easily accessible
to the caller of the LP code. Some modern LP codes
have made partial provision for this in the shape
of "indirect" coefficients -- that is, coefficients
whose values are initially specified as
character-string names which are later bound to
numerical values. However, the mechanisms provided
for altering the values associated with particular
names have not been especially convenient.
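
The separation argued for here can be illustrated with a minimal sketch in which the structure of the model is an immutable description of where the non-zero elements lie, and the coefficient values live in a separate table keyed by symbolic name. The names Structure and bind, and the example symbols, are assumptions made for illustration; they do not correspond to any existing LP code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    """Size and non-zero pattern of an LP; contains no numerical values."""
    rows: tuple
    cols: tuple
    pattern: tuple  # ((row, col, symbol), ...): position of each non-zero


def bind(structure, values):
    """Attach numerical values to the fixed pattern of non-zeros."""
    return [(r, c, values[sym]) for r, c, sym in structure.pattern]


structure = Structure(
    rows=("COST", "DEMAND"),
    cols=("X1", "X2"),
    pattern=(
        ("COST", "X1", "c1"),
        ("COST", "X2", "c2"),
        ("DEMAND", "X1", "one"),
        ("DEMAND", "X2", "one"),
    ),
)
values = {"c1": 2.0, "c2": 5.0, "one": 1.0}

first = bind(structure, values)
values["c1"] = 3.0               # revision touches only the value table
second = bind(structure, values)
```

Between the two bindings only one entry of the value table changes; the structure is never regenerated, which is the property the caller of the LP code needs when successive runs differ only in their coefficients.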

In all of the foregoing, it is clear that the
greatest hindrance to combining a modern
large-scale LP system with an econometric system of
comparable complexity is the insistence of the LP
system on setting up its own environment and
control mechanisms. In particular, operations on the
LP data base are usually only possible within the
environment set up by the LP or matrix generator.
The external interface of a typical modern
large-scale LP system has been designed to be
convenient for a human; such an interface is not
well adapted for use by a computer program.
DATAFORM ostensibly has the capability of solving
non-linear problems by an iterative algorithm
involving dynamic revision and solution of
successive linear programs. However, MPS III must
remain in overall control of the process, and
although it is possible to invoke from DATAFORM
subroutines written in, for example, FORTRAN, it
would not be possible to invoke another computer
system, such as a simulator.

3.    New  Methodology 

Interactive Computing

We make the assumption, in considering new
approaches to dealing with the problems outlined
above, that computers will be used increasingly in
an interactive mode rather than in a batch mode,
especially during the development phase of a
programming project. Although interactive operation
has little effect, in principle, on the computer
systems required for successfully combining LP with
other models, it greatly affects the flavor of our
approach. It is assumed here that the reader has
had some exposure to interactive computing, and
appreciates both the advantages it affords the user
and the difficulties it entails for the system
designer.

NBER  Support  Systems 

The Computer Research Center of NBER has
implemented and is extending a set of system
facilities designed, among other things, to aid
the developer of new modeling systems. The
evolution of the philosophy behind these systems
has been explained previously [15]. The Applications
Control System (ACOS) provides a shared hierarchical
file system, a supervisor, an input/output manager
for I/O not connected with the file system, and the
Applications Control Language (ACOL). ACOL is a
language for writing programs which interpret
commands entered from a terminal according to a
programmer-defined syntax, and for controlling the
execution of program modules in response to such
commands. For more detailed information on ACOS
and ACOL the appropriate manuals [16] should be
consulted.

ACOS supports, as a subsystem, DASEL, which
provides both a language for specifying mathematical
models and a language in which to implement
algorithms [17]. The modeling language, like that in
TROLL [7], allows equations to be entered
symbolically, but it goes beyond TROLL in offering
facilities for symbolic manipulation, such as
differentiation. DASEL has a library procedure
which will solve small LP problems, and also allows
more powerful LP programs to be called by means of
its interlanguage communication feature. ACOS also
supports XMP, a flexible subsystem which solves
large-scale LP models.

Analysis  of  the  Model  Combination  Problem 

In what follows, we restrict our attention to
one particular combination of models -- that
required to solve the equilibrium problem [1,3].
This will make the discussion clearer and more
concrete, and the principles developed will be
applicable to almost any combination of modeling
techniques. The steps in the solution of the
equilibrium problem are:

1. Define the econometric model (EM).

2. Define the LP model.

3. Specify an initial solution of prices
for the EM.

4. DO while convergence criterion not met:

a) Compute demands from EM.

b) Pass demands to LP.

c) Solve LP to compute optimal
allocation.

d) Retrieve shadow prices from LP.

e) Test for convergence; if not
converged, compute revised prices.

END DO

5. Display results.

It is readily apparent that we require the EM
program, or some program hierarchically superior to
both the EM program and the LP program, to be in
control of the overall execution of this algorithm.
The LP program acts as a subroutine, albeit an
intricate one with complex input and output. Let
us now consider each of the above steps in turn to
see how they can be implemented with the right
software.
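
The control structure of the five steps listed above can be sketched as follows, with a driver in overall control and the LP acting as a subroutine. Everything numerical here is an assumption chosen to keep the sketch self-contained: the EM is a toy linear demand curve, the LP is a single-constraint least-cost supply problem (for which filling sources in cost order is optimal, and the shadow price of the demand row is the cost of the marginal source), and the price revision is simple damping. None of this represents the actual models or the DASEL and XMP interfaces.

```python
# (unit cost, capacity) of each supply source, sorted by cost; illustrative data
SOURCES = [(2.0, 50.0), (5.0, 50.0), (9.0, 100.0)]


def em_demand(price):
    """Toy econometric model: demand falls linearly with price."""
    return max(0.0, 120.0 - 6.0 * price)


def solve_lp(demand):
    """Toy LP: meet `demand` at least cost from capacity-limited sources.

    Returns the allocation and the shadow price of the demand constraint.
    """
    allocation, remaining, shadow = [], demand, 0.0
    for cost, capacity in SOURCES:
        used = min(capacity, remaining)
        allocation.append(used)
        if used > 0:
            shadow = cost          # cost of the last source actually used
        remaining -= used
    if remaining > 1e-9:
        raise ValueError("demand exceeds total capacity")
    return allocation, shadow


def equilibrium(price=1.0, damping=0.5, tol=1e-6, max_iter=100):
    """Steps 3-4: iterate between EM and LP until prices agree."""
    for _ in range(max_iter):
        demand = em_demand(price)               # 4a: compute demands from EM
        allocation, shadow = solve_lp(demand)   # 4b-4d: pass, solve, retrieve
        if abs(shadow - price) < tol:           # 4e: convergence test
            return price, demand, allocation
        price += damping * (shadow - price)     # 4e: revised price
    raise RuntimeError("no convergence within max_iter iterations")


if __name__ == "__main__":
    print(equilibrium())                        # step 5: display results
```

Note that the driver never inspects the internals of either model; it sees only demands going one way and shadow prices coming back, which is the interface the remainder of this section is concerned with.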

Definition  of  the  EM 

We assume that the econometric modeling language
system allows the model to be specified by means
of symbolic equations.

Definition  of  the  LP 

Small LP models can be defined in terms of constraint
expressions and an objective with only
minor  extensions  to  the  EM  modeling  language.  One 
needs  to  be  able  to  attach  labels  to  constraints, 
so  that  the  objective  may  be  identified  and  so  that 
constraints  may  be  referred  to  subsequently,  and 
the  language  must  permit  the  operators  "<="  and 
">="  in  addition  to  "=".    Of  course,  the  LP  program 
must  be  able  to  accept  the  model  in  such  a  form, 
rather  than  in  a  BCD  representation;  however,  this 
requirement  is  easily  met,  since  all  the  processing 
to  do  with  parsing  the  equations  is  already  present 
in  the  EM  system.    Both  the  DASEL  library  procedure 
and  XMP  can  accept  an  LP  model  in  this  form. 

Formulation of a large LP constraint by
constraint would be impossibly cumbersome. We have,
therefore, developed a language called XML which
allows an LP to be specified as constraints indexed
over sets, and which incorporates the Σ (summation)
operator [5]. An LP model may be expressed in XML
in essentially the same way as it would be in
purely mathematical terms, with some concessions to
the shortcomings of computer typography. XML allows
coefficient values to be specified symbolically;
the actual numerical values may be stored in a data
structure (an n-dimensional array) in the user's
file system, and a binding process associates the
XML symbolic name with the file name of the data
structure. This is somewhat analogous to the use
of indirect coefficients in some current LP
systems, but it is more flexible, as will become
clear in the subsequent discussion of model
revision.

Specification  of  an  initial  solution 

The  precise  details  of  this  will  depend  on 
the EM system, but any good system will make
construction of a vector of values a simple process.

Main Iterative Loop

If the language of the EM system allows the
LP program to be invoked by subroutine call or
similar mechanism, and if it also has constructs
for testing, branching, and looping, this
iterative loop can be controlled by the EM system.
Both TROLL and DASEL provide the appropriate
constructs, but only DASEL has LP as a callable
function.

We can assume that steps 4a and 4c are
achieved simply by invoking the respective
programs, and that the EM system provides the
facilities for computing the revised prices in
step 4e.

The passing of data in steps 4b and 4d is the
real crux of the problem. It is easily solved if
the LP is small enough to be expressible in the EM
language; for example, if DASEL were used, the data
could be passed simply as DASEL variables.
Specifically, the demands are DASEL variables to
the EM program, which assigns values to them,
although they act as constants to the LP program,
for which they are coefficients in the
right-hand-side (or bound values). The LP returns
the shadow prices (π-vector) as an explicit vector
which is also a DASEL variable, and therefore the
EM program can retrieve values from it. The
difficulty with this approach is that it requires
the user to keep track of the elements in these
vectors by numerical index values; this is
satisfactory for a small LP model, but impractical
for a large one.

If one attempts to extend this approach to
larger LP models, problems arise other than keeping
track of variables by index number. Even with a
computer which provides virtual storage, the
available "core" storage may be exceeded, because
the code and data for both the EM and the LP exist
simultaneously. Further, the LP program constructs
its working data from the modeling-language
description of the LP from scratch each time; for a
small LP this is a reasonable approach, but it would
clearly be grossly inefficient to generate a large
LP problem each time from an XML specification when
the only changes concern a few right-hand-side
values. Finally, when retrieving solution values
we wish to refer to rows and variables in terms of
the XML model specification, and use this as a
selection mechanism, because the full primal and
dual solution to a large LP would occupy an
unnecessarily large amount of memory if it were
expressed as unpacked vectors.

Efficient  Revision  of  LP  Models 

In order to solve the problem of lack of space
we require a supervisory program which passes
control alternately to the EM program and the LP
program. In this way, much of the data areas and
executable code may be overlaid. ACOL provides just
such a capability.

Unfortunately, if we remove control from the
EM program and demand an efficient mechanism for
making small revisions to the LP model, we encounter
a problem with the scope of the data. This arises
as follows. To solve an XML model, the user
supplies the data (as n-dimensional arrays),
specifies a binding of XML identifiers to the data,
and invokes XMP with the name of the XML model as an
argument. At this level of operation, data entities
such as "right-hand-side" which are internal to the
LP system are not accessible, and so values in them
cannot be changed. It would be possible to allow
access to these internal data if either they were
preserved in core between calls to the LP code
(ACOL permits this), or if the data were saved on
the file system, but both these approaches are open
to objection.

The good solution to the problem lies in taking
advantage of the hierarchical structure of XMP,
which was designed to facilitate just such use.
The top-level procedure XLPSOLVE consists of calls
to the second-level procedures GETPROB (which
converts the problem from XML form to internal form)
and SOLVE (which performs simplex iterations).
Internal data are accessible at this level, and so
small revisions to the right-hand-side values, for
example, are easy; system routines are provided for
this, in order to maintain modularity in the Parnas
[18] sense. The particular right-hand-side
elements being changed may be referred to by the
name of the corresponding constraint expression, so
that the user need not be concerned with numerical
indices. The conventions governing the use of
names are explained in the section "Retrieval of
LP Results". We can then rewrite Step 2 of the
algorithm given above as:

2. a. Define the LP in XML.

b. Invoke GETPROB to convert to internal
form.

and Step 4c as:

c. Invoke SOLVE.

Observe that GETPROB and SOLVE can be invoked
independently of XLPSOLVE, something which is not
possible with current commercial LP systems. It is
possible in XMP because data entities are passed as
arguments, and therefore no routine is required to
set up a special environment in which these
procedures operate.
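
The division of labor just described, conversion once followed by revision of internal data by constraint name, can be sketched as follows. The functions getprob and set_rhs are hypothetical stand-ins written for this illustration; they borrow only the names and roles given in the text and are not the XMP routines, and the simplex step itself is omitted.

```python
def getprob(model):
    """Convert a named model description to an index-based internal form.

    In the scheme described in the text this conversion is performed once,
    not on every pass through the iterative loop.
    """
    names = list(model["rhs"])
    return {
        "row_index": {name: i for i, name in enumerate(names)},
        "rhs": [model["rhs"][name] for name in names],
    }


def set_rhs(internal, row_name, value):
    """Revise one right-hand-side element, addressed by constraint name."""
    internal["rhs"][internal["row_index"][row_name]] = value


# The internal form is an ordinary value passed as an argument, so no
# special environment has to be established before either routine is used.
internal = getprob({"rhs": {"DEMAND_NE": 10.0, "DEMAND_NW": 4.0}})
set_rhs(internal, "DEMAND_NW", 7.5)
# A routine playing the role of SOLVE would now be invoked on `internal`.
```

The caller never handles a numerical row index; the mapping from names to positions is built once during conversion and reused for every revision.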

Retrieval of LP Results

The particular problem we are considering
requires only the retrieval of a few shadow prices.
In general, one may assume that the solution values
most often requested will be primal and dual
values, and that tableau values will be required
much less frequently; hence, XMP makes primal and
dual values especially accessible. The internal
form of the final basis is also made accessible
so that it can be saved and used to initiate a
future execution of SOLVE.

As with revision of a model, the scope of
data presents a problem with retrieval of results.
However, we can overcome the problem by using a
special feature of ACOL. When simplex iterations
have finished, SOLVE makes a recursive call to
ACOL (recursive because ACOL invoked SOLVE in the
first place). SOLVE is still active and
therefore all its internal data are preserved. In
response to the recursive call, the ACOL
interpreter starts to read from whatever environment
invoked SOLVE. In this particular case, SOLVE
was invoked from an ACOL program, and so more of
this program is read. The statements that are
read are requests to SOLVE for solution
information which are implemented by the ACOL
program; it calls entry points in the SOLVE program
which return solution values. The solution values
are then passed via ACOL to the EM program. There
is an explicit ACOL request to unwind the
recursion and leave the SOLVE environment.
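
The recursive-call mechanism can be imitated in a language with closures. In the minimal sketch below, the solver hands control back to the caller's interpreter while it is still active, exposing entry points through which solution values are requested; returning from the interpreter plays the part of unwinding the recursion. The function names, the entry-point table, and the placeholder arithmetic standing in for simplex iterations are all assumptions for illustration and do not reproduce ACOL or SOLVE.

```python
def solve(problem, interpreter):
    """Stand-in for SOLVE: finish the work, then call back while still active."""
    # Placeholder for simplex iterations; the factor 2.0 is arbitrary.
    duals = {row: 2.0 * rhs for row, rhs in problem.items()}
    # Entry points into the solver's internal data. They are meaningful only
    # while this call to solve() is active.
    entry_points = {"PI": lambda row: duals[row]}
    interpreter(entry_points)   # the "recursive call" into the controlling program
    # Control arriving here corresponds to unwinding the recursion.


def run_program():
    """Controlling program: invokes the solver, then issues retrieval requests."""
    prices = {}

    def interpreter(entry_points):
        # The continuation of the controlling program: requests for
        # solution information, serviced by the solver's entry points.
        for row in ("NE", "NW"):
            prices[row] = entry_points["PI"](row)

    solve({"NE": 10.0, "NW": 4.0}, interpreter)
    return prices


if __name__ == "__main__":
    print(run_program())
```

Only the values actually requested leave the solver's scope, so the full solution never has to be copied out or written to a file.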

When retrieving solution information, the user
must be able to refer to the constraints and
variables of the LP model by meaningful names. With
conventional matrix generator systems this is
conceptually easy, since all these systems require
that the user specify these names precisely in the
form of 6- or 8-character concatenations. Normally,
these names will have been generated in some
regular fashion which allows selection of the
desired solution information to be achieved by a
masked-name matching technique. However, XML has
no concept of names of this type; its names refer
to whole classes of constraints and variables, with
individual members of a class corresponding to
particular values of elements of indexing sets.

The style of naming adopted by XML is very
convenient for selecting classes of elements of the
model. In the particular case with which we are
concerned, suppose the set of demand constraints is
named DEMAND and is indexed over regions of the
country and over type of energy; these index sets
might be "NE", "MA", ..., "NW" (for New England,
Middle Atlantic, ..., North West) and
"EL", "NG", ..., "RO" (for electricity,
natural gas, ..., residual oil), respectively. Then
the whole class of constraints is referred to by
the name DEMAND; cross-sections would be denoted
by, for example, DEMAND("NE",) or DEMAND(,"NG");
and a single constraint by a name such as
DEMAND("NW","EL"). The ACOL request to retrieve
the shadow prices for the demands might be
something like:

PI  ("DEMANDS", "PRICES") 

where  PI  is  the  request.    The  quotes  denote  that 
each  argument  is  a  literal  character  string;  PRICES 
is  the  name  of  a  DASEL  variable  in  which  the  values 
of  the  shadow  prices  will  be  returned.    The  values 
are  returned  in  the  same  order  as  the  members  of 
the  corresponding  index  sets,  and  are  then  available 
to  the  EM  program  for  further  computation. 
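
The selection of a whole class, a cross-section, or a single member by partial index can be sketched as below. This is an illustration of the naming idea only: the function select, the use of None for an omitted index, the shortened index sets, and the numerical values are assumptions, not XML or ACOL syntax.

```python
from itertools import product

REGIONS = ("NE", "MA", "NW")   # shortened index set, for illustration
FUELS = ("EL", "NG", "RO")


def select(values, index_sets, *key):
    """Return the values of one class of constraints, restricted by `key`.

    A key position of None (or an omitted trailing position) selects all
    members along that index set. Results follow the order of the index sets.
    """
    key = key + (None,) * (len(index_sets) - len(key))
    return [
        values[idx]
        for idx in product(*index_sets)
        if all(k is None or k == i for k, i in zip(key, idx))
    ]


# Made-up shadow prices for the class DEMAND, keyed by (region, fuel).
DEMAND = {
    (r, f): 10.0 * i + j
    for i, r in enumerate(REGIONS)
    for j, f in enumerate(FUELS)
}

whole_class = select(DEMAND, (REGIONS, FUELS))              # like DEMAND
new_england = select(DEMAND, (REGIONS, FUELS), "NE")        # like DEMAND("NE",)
natural_gas = select(DEMAND, (REGIONS, FUELS), None, "NG")  # like DEMAND(,"NG")
single = select(DEMAND, (REGIONS, FUELS), "NW", "EL")       # like DEMAND("NW","EL")
```

Because the result is ordered by the index sets themselves, the receiving program can pair each value with its region and fuel without any generated 8-character names.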

Reporting  of  Final  Results 

Since the LP program is essentially a
subroutine in the whole process, the EM program has
the primary responsibility for reporting the final
results. Both TROLL and DASEL have extensive
facilities for manipulating data and displaying data
in both tabular form and as plots. It is apparent,
however, that the result retrieval operations
discussed above form a partial set of primitives for
the construction of an LP report writing system.

4.    Summary  and  Conclusions 

Although the discussion in Section 3 is largely
centered around a particular model combination, the
principles expounded admit of ready generalization,
and the software described is capable of dealing
with many other possible combinations in a
straightforward manner. The main points to notice
are as follows:

(a)  ACOS  provides  a  uniform  file  system, 
thus  eliminating  one  common  source  of 
incompatibility,  and  a  dynamic  loader. 

(b) ACOL allows one to write code (in PL/I,
FORTRAN or whatever) which may be invoked
either from the terminal or from another
program without the code having to
distinguish between the two. Thus the
algorithmic modules of a system such
as XMP and their associated ACOL driver
programs are readily incorporated into
systems. Also, ACOL allows error
conditions to be treated in a uniform
manner and at the appropriate level.

(c) DASEL provides a suitable language in
which to specify econometric models, and
algorithmic procedures for operations
such as regression and simulation. It
also has facilities for maintaining a
data base, manipulating data, and
displaying data.

(d) XML is a convenient language for
specifying large-scale LP models, and XMP
is a subroutine capable of solving such
models.

Our  view  is  that  extremely  flexible  applications 
software  results  if  carefully  structured  libraries 
of  procedures  are  available  to  be  connected  together 
as  a  user  desires  under  the  general  environment  of 
a  suitably  tailored  operating  system. 
ABSTRACT

This paper describes the fundamental
concepts of data base management and
proceeds to suggest the utility of
these concepts for the data handling
aspects of mathematical programming.
A general network-based data
management system is used to illustrate
data structure definition, data
manipulation and non-procedural query
capabilities with respect to math
programming. The emphasis is upon
flexibility and convenience for both
the implementors and users of
mathematical programming algorithms.

Implementation of mathematical programming
algorithms entails the storage and manipulation
of large volumes of data. Our principal objective
is to suggest ways in which both the implementors
and users of mathematical programming may benefit
from developments in the growing field of data
base management. For implementors, data base
management can provide powerful data storage
mechanisms and facile data manipulation
capabilities. The storage mechanisms are powerful
with respect to their capacities to handle complex
data relationships and both numeric and non-numeric
data within a single data structure. The data
manipulation capabilities allow access to the
data without concern for overlaying or management
of auxiliary storage devices. For users, data
base management can provide enhanced data
maintenance facilities including data editing and
problem formulation.

We commence with a review of important
data base terminology and concepts. These are
utilized to illustrate specific ways in which
various issues of mathematical programming may
be treated with data base management techniques.
The emphasis is therefore on data handling rather
than mathematical or computational details.
Naturally, the development of an algorithm will
exploit the available data structure and data
handling capabilities. As will be noted shortly,
there are several varieties of data bases; the
network variety, which is the most flexible and
general, will be used for illustrative purposes. A
language is presented which allows full
manipulative capabilities with respect to data
organized according to network structures. This
language is completely independent of the types of
data that are stored and it may be used within the
confines of such commonplace languages as FORTRAN,
ALGOL, or COBOL. An example is provided
demonstrating how this data manipulation language
can be used to accomplish the pivot operation.
Finally, a high-level, English-like, non-procedural
query language is described. The query language is
independent of the data structure and automatically
interfaces application routines with desired data.

DATA  BASE  CONCEPTS 

A data base has two major attributes: its
logical structure and a collection of data values
which are stored according to that structure. The
logical structure (also called the schema) is
effectively a blueprint describing what types of data
values may be included in the data base and how
each of these types is related to the other types.
The schema serves as the basis for all data
manipulation; addition, deletion, retrieval, or
modification of a data value is accomplished by
references to its corresponding type in the logical
structure.

Every data base structure can be described in
terms of three basic features: data item types,
record types, and sets. A particular schema is
composed of data item types which are related to
one another by record types and by sets. So data
item types are considered to be the most
fundamental "building blocks" of a logical structure.
Each data item type is identified by a unique name
and indicates a distinct type of data. For instance,
each of the data item types VARIABLE-ID,
VARIABLE-DESCRIPTION, CONSTRAINT-ID,
CONSTRAINT-DESCRIPTION, COEFFICIENT-VALUE, and
COEFFICIENT-SOURCE indicates a distinct type of
data which we may want to include in a data base.
Each data item type represents many occurrences of
data values of that type within the data base. It
is important to understand that a data item type is
a component of the schema, whereas a data value is
an actual instance of data of some type. For
example, the data item type VARIABLE-ID may
represent the data value occurrences 'X1' through
'X100'.

A logical structure is formed by relating the
various data item types with each other. There are
two varieties of relationship: aggregation and
association. A record type is a named aggregate
of data item types. For example, the record type
VARIABLE may be composed of the data item types
VARIABLE-ID and VARIABLE-DESCRIPTION. Figure 1a
gives a pictorial representation of this, where the
record type is shown as a rectangle labeled
VARIABLE and enclosing the names of its data item
types. Just as a record type is an aggregate of
data item types, an occurrence of a record type is
an aggregate of data value occurrences (i.e., one
data value occurrence for each of the data item
types which constitute the record type). A sample
occurrence of the record type VARIABLE is 'X1' and
'AMOUNT OF RESOURCE 5 TO BE USED.'


Figure 1. Example of Structural Components
of a Data Base

(a) Record type VARIABLE, enclosing the data item
types VARIABLE-ID and VARIABLE-DESCRIPTION.

(b) Set HAS, drawn as an arrow from the record type
VARIABLE to the record type COEFFICIENT
(COEFFICIENT-VALUE, COEFFICIENT-SOURCE, CURRENCY).


The  second  kind  of  relationship  allows 
record  types  (and  therefore  item  types)  to  be 
associated  with  each  other  by  means  of  a  set  rela- 
tion.   This  notion  of  a  set  was  popularized  by  the 
CODASYL  Data  Base  Task  Group  (DBTG)  Report  of  1971 
[1].    The  DBTG  notion  of  a  set  must  not  be  con- 
fused with  the  concept  of  a  mathematical  set;  there 
is  no  direct  relation  between  the  two.    A  set  is 
described  in  terms  of  an  owner  record  type  and  a 
member record type such that there is a one-to-many
relationship between each occurrence of the
owner type and occurrences of the member record
type. But although there may be many (or one or
zero) member occurrences associated with each
owner occurrence, for a given set a particular
member occurrence may be associated with no more
than one occurrence of the owner record type.
Consider the record types COEFFICIENT and VARIABLE
shown in Figure 1b, where the former is an
aggregation of such data item types as
COEFFICIENT-VALUE and COEFFICIENT-SOURCE. The two
record types are related via the set named HAS.
This set is denoted by an arrow pointing from its
owner record type (VARIABLE) to its member record
type (COEFFICIENT). Recalling the definition of a
set, we observe that each occurrence of VARIABLE
may have many occurrences of COEFFICIENT associated
with it, but a given occurrence of COEFFICIENT may
be associated with no more than one occurrence of
VARIABLE. Thus the set HAS properly describes the
relation between variables and coefficients in the
context of linear programming. If we reverse the
direction of the arrow in Figure 1b, then the structure
no longer supports the relation between coefficients
and variables which is inherent in linear
programming. Thus an important issue in data
base management is the design of a schema that correctly
reflects the relationships among item types.
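
The one-to-many rule that defines a set can be sketched in a few
lines of Python. The sketch is illustrative only: the class name
SetType, its method add_member and the occurrence names are
assumptions made for this example and belong neither to the DBTG
report nor to any data base management system. Record occurrences
are represented simply by strings.

```python
class SetType:
    """A DBTG-style set: one owner may have many members,
    but a member may have at most one owner within this set."""

    def __init__(self, name):
        self.name = name
        self.owner_of = {}    # member occurrence -> its single owner
        self.members_of = {}  # owner occurrence -> list of its members

    def add_member(self, owner, member):
        # Reject a second owner for the same member occurrence.
        if member in self.owner_of:
            raise ValueError(
                f"{member!r} already has an owner in set {self.name}")
        self.owner_of[member] = owner
        self.members_of.setdefault(owner, []).append(member)


# Hypothetical occurrences: variables own coefficients via HAS.
has = SetType("HAS")
has.add_member("X1", "coef-1")
has.add_member("X1", "coef-2")
has.add_member("X2", "coef-3")

try:
    has.add_member("X2", "coef-1")   # coef-1 already belongs to X1
    second_owner_rejected = False
except ValueError:
    second_owner_rejected = True
```

Reversing the roles of owner and member in such a structure would
let a coefficient own many variables, which is why the direction of
the arrow matters.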

Not only does a set provide information about
relationships among occurrences of its owner and
member record types, but it also permits the member
occurrences associated with each owner occurrence
to be logically ordered (for purposes of
storage and access) according to some criterion.
For example, given an owner occurrence of the set
HAS, its member occurrences may be ordered in an
ascending fashion according to the values of their
respective COEFFICIENT-VALUE occurrences.

Figure 2 displays an example of record occurrences
organized according to the schema of Figure
1b. Three expressions are shown having five variables
and a total of twelve coefficients. In the
diagram beneath the expressions, circles denote
record occurrences of the type shown in the right
margin. The arrows show which member occurrences
are owned by each owner occurrence via the set
indicated in the right margin. For purposes of
diagrammatic clarity, record occurrence details are
not shown; e.g. no data values of VARIABLE-DESCRIPTION
are included in occurrences of the VARIABLE
record type. Also notice that no zero coefficients
are stored. The diagram clearly displays the idea
of a set. Each owner record occurrence may be associated
with many member record occurrences, but no member
occurrence is associated with more than one owner
occurrence.

Expression

E1 = 100X1 + 50X2 + 30X3 - 70X4 - 30X5

E2 =       - 10X2 + 13X3 - 10X4 - 90X5

E3 =  12X1               +  3X4 + 10X5

[Diagram: circles denote occurrences of the record types VARIABLE
and COEFFICIENT; arrows denote the set HAS.]

Figure 2. An Occurrence Level Example
Based on the Schema of Figure 1b.

It is readily apparent that the structure of
Figures 1b and 2 does not distinguish between
coefficients of one expression and those of another.
In a subsequent section, we demonstrate a manner
in which this shortcoming can be remedied.

A  TAXONOMY  OF  DATA  BASE  STRUCTURES 

Utilizing the concepts and terminology introduced
in the preceding section, we can illustrate
the varieties of data base structures which can
exist. The most elementary data structure is composed
of a single record type. This is analogous
to a FORTRAN array. A slightly more complex circumstance
occurs when data values are organized into a
number of disjoint (i.e., no set relationships) record
types; this corresponds somewhat to a group of
FORTRAN arrays. Figure 3a shows a linear structure;
this means that each record type is the owner of at
most one set and is the member of no more than one
set. Figure 1b is an example of a linear structure.
A data base may be composed of several disjoint linear
structures. A more flexible structure, the
tree, is depicted in Figure 3b. Observe that in
this type of structure a record type may be the
owner of more than one set, but it can be the member
of at most one. Thus there is a unique path
between any two record types. Figure 3c shows a
network data structure. This is the most general
of all data structures that are describable in
terms of the three features: data item type, record
type and set. That is, the record type, linear
structure and tree are all special cases of a
network.


Figure 3. Varieties of Data Structure

Since a record type in a network may be the
member of many sets and also the owner of many sets,
multiple paths may exist between two record types.
The term path indicates a sequence of sets which
are related via record types, such that any two record
types on the path are related by a unique subsequence
of the path's sets (i.e., the path contains
no loops). It must be emphasized that although a
path is composed of directed arcs, the definition
of a path contains no reference to direction. The
direction of arcs (sets) within a path offers no
impediment to the traversal of record types. Figure
4a presents a special type of path which may
exist in a network structure. The path between
VARIABLE and EXPRESSION consists of the two sets
HAS and CONTAINS, related through the record type
COEFFICIENT. A path of this sort is used to indicate
a many-to-many relationship between variables
and expressions. That is, a variable participates
in many expressions and an expression contains many
variables. But each occurrence of COEFFICIENT is
'owned' by one occurrence of VARIABLE (via the set
HAS) and by one occurrence of EXPRESSION (via the
set CONTAINS). It, therefore, relates a variable
and an expression without violating the definition
of a set. This is exemplified in Figure 4b, using
the expressions of Figure 2.

Figure 4. Example of a Many-to-Many Relationship


LANGUAGE  FOR  DATA  BASE  DEFINITION 

We preface this section with a few words of
caution. It cannot be asserted too strongly that
the diagrams in Figures 1 through 4 are not flow
charts or PERT diagrams; they can in no way be considered
to represent algorithms. As already stated,
the diagrams with rectangles depict logical structures
according to which data may be organized and
a diagram with circles gives specific examples of
data values which have been organized on the basis
of some schema. The diagrams of logical structures
provide a simple mechanism for representing and
communicating about a data base. However, they are by
no means simplistic, but rather quite powerful.
This power arises from the capacity of a logical
structure to support the storage and access of data
values, in such a way that a data base user need
not be explicitly concerned with the complicating
factors of physical storage and access. For instance,
the particular data base management system
that is examined in this paper is implemented in
terms of doubly linked lists which are stored on
a direct access device. The physical storage and
retrieval of data for this system requires extensive
pointer manipulation and the buffering of blocks of
data from the direct access device. However, the
typical data base user is concerned with none of
this; physical manipulations are automatically
accomplished by submitting requests in terms of the
logical structure.

The logical structure of a data base is formally
defined via a Data Description Language (DDL).
We illustrate this by using the DDL of a particular
data management system: the Generalized Planning
System (GPLAN) developed at Purdue University. The
DDL for the data structure of Figure 4a is shown
in Figure 5. In this version of DDL all identifiers
are restricted to four characters in length. As
the figure indicates, the DDL is non-procedural.
Each record type is followed by the item types which
compose it. Each item type is defined in terms of
its name, its type (e.g. integer, character, real)
and its size. The size of a character item type is
given by its number of characters; other size
specifications are in terms of computer words. The DDL
description of a schema serves as input to a DDL
Analyzer which produces schema tables. These tables
are subsequently used by the Data Manipulation
Language (DML) for creating and accessing record
occurrences.

Figure 5. Example of Schema Specified
in Terms of DDL


DATA MANIPULATION LANGUAGE

The GPLAN Data Manipulation Language [2], [3]
consists of approximately seventy commands which
allow the data base user to perform the following
types of functions: 1) open and close the data
base; 2) create and delete record occurrences; 3)
set currency indicators; 4) add and remove record
occurrences from sets; 5) store and retrieve data
from record occurrences; and 6) search through sets
for particular record occurrences. In the GPLAN
implementation each DML command is a call to a
FORTRAN subroutine. Thus the DML can be considered
as an extension to the FORTRAN language. This
extended FORTRAN provides for all varieties of data
structures and furnishes powerful data handling
capacities.

In order to demonstrate DML commands and how
they may be used, a detailed discussion is provided
of a FORTRAN subroutine which executes a pivot
operation. This subroutine assumes that a matrix has
been stored in a data base whose logical structure
is shown in Figure 6a. Figure 6b gives the DDL of
this structure. We must point out that no storage
is allocated for any zero valued coefficient before,
during or after the pivot operation. Figure 7
displays the pivot subroutine with input arguments for
the variable leaving the basis (BVAR) and the variable
entering the basis (NBVAR).

Figure 6. Schema and DDL Used by
Algorithm of Figure 7.


Hollerith arguments of DML commands refer to
data item types, record types or sets that are defined
in the schema; all other arguments are FORTRAN
variables. The FMSK command (Find Member based
on Sort Key) locates the member of the set SB whose
sort key (the data item type NAMB) has the value
given in the FORTRAN variable BVAR. Each record
occurrence in the data base is identified by its
own unique data base key. GKM (Get Key from Member)
gets the key of the member of the set SB (i.e., the
member which has just been located by FMSK) and
places that value in the FORTRAN variable KEYB.
Next, SFM (Set Field based on Member) sets the value
of NAMB in the member occurrence of SB to the
value of the FORTRAN variable NBVAR. In these
three commands, we have removed BVAR from the set
of basic variables, and we have entered NBVAR into
the basis. Similarly the next three commands remove
NBVAR from the set of non-basic variables and
replace it with BVAR; KEYB now contains the key
of the formerly basic variable and KEYN is the key
of the formerly non-basic variable.

Now we find and update the pivot element. The
command SOM (Set Owner of one set based on the Member
of another set) sets the owner of the set ROW,
based on the current member of the set SB. Recall
that the current member of the set SB is the formerly
basic variable. As a result of the SOM, the
current member of ROW is an occurrence of the record
type COEF. The next two commands are used to determine
if this coefficient was a member of the COL
(column) owned by the formerly non-basic variable.
SMM sets the member occurrence of COL based on the
current member of ROW. Since a set is a strictly
one-to-many relationship, having a member of COL
immediately gives us its unique owner. GKO gets
the key of this owner and returns it in the variable
KEYC. We check KEYC against KEYN to find if we
have located the coefficient which was the member
of the COL owned by the previously non-basic variable.
If this is not the case, then we call FNM
(Find Next Member) which locates the next occurrence
of coefficient that was in the ROW of the
formerly basic variable. If there is no next member
then an error has occurred. If a next coefficient
of the formerly basic row is found, then as before,
we determine with which column that coefficient is
associated. Upon finding the pivot element, we update
it appropriately and retrieve its key (i.e.,
the address of its location in the data base).

In  order  to  update  coefficients  in  the  pivot 
column,  we  first  set  the  owner  of  COL  based  on 
KEYN  by  using  the  SOK  command.     Usage  of  the  other 
commands  appearing  in  this  procedure  has  already 
been  illustrated  with  the  exception  of  DRM  (De- 
lete Record  based  on  Member).    As  can  be  seen 
from  the  code,  if  the  updated  value  (y)  of  a  co- 
efficient is  zero  then  DRM  is  used.    This  command 
deletes  the  current  member  of  COL  (i.e.,  the  oc- 
currence of  COEF  whose  updated  value  would  have 
been  zero). 

Upon updating an element in the pivot column
we proceed to update each coefficient in the column
element's row. The commands for this are much
the same as those given above. There are two new
points here. FFM finds the first member of a specified
set; if FNM returns a negative value in the
IERR variable then there are no more members in the
set being considered. As the code indicates, for
each non-zero coefficient in the pivot row, we update
the corresponding coefficient in the pivot column
element's row. Now if there is no such corresponding
coefficient (i.e., the coefficient was
zero) then one must be created and given the proper
value. The command CR creates a record occurrence
of the type specified in its first argument. So
statement 60 allocates storage for a record occurrence
of the type COEF; KEYNB is returned from CR
containing the unique data base key for the created
record occurrence. The next command, SFR (Set
Field of Record), inserts the value of X*Y into
the VAL field (i.e., data item type) of the just
created record occurrence of COEF. Finally this
record occurrence must be associated with the proper
row and column through the sets ROW and COL.
This is accomplished by AMS (Add Member to Set).
For instance, the final AMS shown adds the record
occurrence of COEF to the appropriate owner occurrence
of the set COL.

If  there  is  no  need  to  create  an  occurrence  of 
COEF,  then  the  existing  tableau  element  is  updated 
in  a  straightforward  manner.    If  the  updated  value 
is  zero  then  the  record  occurrence  of  COEF  is  dele- 
ted; the  space  which  it  has  occupied  is  returned  to 
a  pool  of  available  space  for  future  uses  of  the  CR 
command.     Finally  after  coefficients  of  the  row 
associated  with  each  pivot  column  element  (except 
for  the  pivot  element  itself)  are  updated,  the  pi- 
vot row  coefficients  are  properly  adjusted. 
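
The arithmetic carried out by the pivot subroutine can be sketched
without a data base, assuming the tableau is held in a Python
dictionary keyed by (row, column) with zero entries absent. This is
not the GPLAN code: the function name pivot and the dictionary
representation are assumptions of the sketch, and the set traversals
of the subroutine are replaced by dictionary lookups.

```python
EPSLON = 1e-8


def pivot(tableau, r, c, eps=EPSLON):
    """Pivot in place on element (r, c) of a sparse tableau.

    tableau maps (row, column) -> non-zero value; zeros are not stored.
    """
    piv = tableau[(r, c)]
    # Snapshot the old pivot row and pivot column, excluding the pivot.
    row = {j: v for (i, j), v in tableau.items() if i == r and j != c}
    col = {i: v for (i, j), v in tableau.items() if j == c and i != r}

    tableau[(r, c)] = 1.0 / piv
    for i, a_ic in col.items():
        y = -a_ic / piv                  # updated pivot-column element
        if abs(y) <= eps:
            del tableau[(i, c)]          # delete the zero occurrence
            continue                     # nothing to add to this row
        tableau[(i, c)] = y
        for j, x in row.items():         # x is an old pivot-row element
            w = tableau.get((i, j), 0.0) + x * y
            if abs(w) < eps:
                tableau.pop((i, j), None)    # delete (or never create) a zero
            else:
                tableau[(i, j)] = w          # update or create the element
    for j, x in row.items():             # finally the pivot row itself
        tableau[(r, j)] = x / piv
    return tableau
```

Creating an entry where none existed plays the part of the CR and
AMS commands, and removing an entry plays the part of DRM.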

The purpose of the foregoing example is to be
suggestive of the nature and capacities of a
DML. A comprehensive description of the GPLAN DML,
including physical implementation and operation, is
given in [2]. Observe that the DML obviates any
need for overlays or paging with respect to data
storage. In the preceding example there was no
need to use an array. In addition the programmer
is not required to have explicit knowledge of any
physical pointers. In the pivot example, the programmer
never sees a physical pointer; all data
manipulation is dealt with on a logical level using
the logical structure of the schema. Each DML command
is a statement in terms of the logical structure
and it automatically executes the cumbersome
physical details implied in that statement.

      SUBROUTINE PIVOT (BVAR, NBVAR)
      INTEGER BVAR
      DATA EPSLON/1.E-8/
C     TAKE BVAR OUT OF BASIS, REPLACE WITH NBVAR
      CALL FMSK (2HSB, BVAR, IERR)
      CALL GKM  (2HSB, KEYB, IERR)
      CALL SFM  (4HNAMB, 2HSB, NBVAR, IERR)
      CALL FMSK (2HSN, NBVAR, IERR)
      CALL GKM  (2HSN, KEYN, IERR)
      CALL SFM  (4HNAMN, 2HSN, BVAR, IERR)
C     FIND AND UPDATE PIVOT ELEMENT
      CALL SOM  (3HROW, 2HSB, IERR)
   10 CALL SMM  (3HCOL, 3HROW, IERR)
      CALL GKO  (3HCOL, KEYC, IERR)
      IF (KEYC .EQ. KEYN) GO TO 20
      CALL FNM  (3HROW, IERR)
      IF (IERR) 999, 10, 999
   20 CALL GFM  (3HVAL, 3HROW, PIV, IERR)
      CALL SFM  (3HVAL, 3HROW, 1./PIV, IERR)
      CALL GKM  (3HROW, KEYP, IERR)
C     ITERATE THRU, UPDATE PIVOT COLUMN, DIVIDE EACH COEFFICIENT
C     IN PIVOT COLUMN BY PIVOT ELEMENT
      CALL SOK  (3HCOL, KEYN, IERR)
   30 CALL GKM  (3HCOL, KEYC, IERR)
      IF (KEYC .EQ. KEYP) GO TO 105
      CALL GFM  (3HVAL, 3HCOL, Y, IERR)
      Y = -Y/PIV
      IF (ABS(Y) .GT. EPSLON) GO TO 35
      CALL DRM  (3HCOL, IERR)
      Y = 0.0
      GO TO 30
   35 CALL SFM  (3HVAL, 3HCOL, Y, IERR)
C     DETERMINE KEY OF THIS COLUMN ELEMENT'S ROW (CER)
   36 CALL SMM  (3HROW, 3HCOL, IERR)
      CALL GKO  (3HROW, KEYR, IERR)
C     NOW, ITERATE THRU THE PIVOT ROW
      CALL SOK  (3HROW, KEYB, IERR)
   40 CALL SMM  (3HCOL, 3HROW, IERR)
      CALL GKM  (3HCOL, KEYPR, IERR)
      IF (KEYPR .EQ. KEYP) GO TO 95
      CALL GFM  (3HVAL, 3HCOL, X, IERR)
C     DOES CER HAVE AN ENTRY IN THIS COLUMN
      CALL FFM  (3HCOL, IERR)
   50 CALL SMM  (3HROW, 3HCOL, IERR)
      CALL GKO  (3HROW, KEY, IERR)
      IF (KEY .EQ. KEYR) GO TO 70
      CALL FNM  (3HCOL, IERR)
      IF (IERR) 60, 50, 999
C     NO CORRESPONDING ENTRY - CREATE ONE
   60 CALL CR   (4HCOEF, KEYNB, IERR)
      CALL SFR  (3HVAL, 4HCOEF, X*Y, IERR)
      CALL AMS  (2HSC, 4HCOEF, IERR)
      CALL SOK  (3HROW, KEYR, IERR)
      CALL AMS  (3HROW, 4HCOEF, IERR)
      CALL AMS  (3HCOL, 4HCOEF, IERR)
      GO TO 90
C     UPDATE TABLEAU ELEMENT
   70 CALL GFM  (3HVAL, 3HCOL, W, IERR)
      W = W + X*Y
      IF (ABS(W) .LT. EPSLON) GO TO 80
      CALL SFM  (3HVAL, 3HCOL, W, IERR)
      GO TO 90
C     DELETE ZERO OCCURRENCE
   80 CALL DRM  (3HCOL, IERR)
C     GO ON TO NEXT ELEMENT IN PIVOT ROW
   90 CALL SMK  (3HROW, KEYPR, IERR)
   95 CALL FNM  (3HROW, IERR)
      IF (IERR) 100, 40, 999
C     FIND NEXT VALUE IN PIVOT COLUMN
  100 CALL SMK  (3HCOL, KEYC, IERR)
  105 CALL FNM  (3HCOL, IERR)
  106 IF (IERR) 110, 30, 999
C     FINALLY, UPDATE PIVOT ROW ITSELF
  110 CALL SOK  (3HROW, KEYB, IERR)
  120 CALL GKM  (3HROW, KEYC, IERR)
      IF (KEYC .EQ. KEYP) GO TO 130
      CALL GFM  (3HVAL, 3HROW, X, IERR)
      CALL SFM  (3HVAL, 3HROW, X/PIV, IERR)
  130 CALL FNM  (3HROW, IERR)
      IF (IERR) 140, 120, 999
C     ERRORS
  999 CALL ERR  (IERR, 3, 0)
      CALL ABORT
  140 RETURN
      END

Figure 7. Pivot with DML

PRACTICAL  CONSIDERATION 

It is not suggested that the foregoing subroutine
be used in a practical sense. Since the tone
of this paper is basically tutorial, we have chosen
the familiar pivot operation as a vehicle for conveying
the fundamental ideas and methods of data
structuring and data manipulation. The example
serves to illustrate all of the DML commands which
are typically used in the process of data base management.
A major difficulty with the simplex method
is that an originally sparse matrix is usually altered
such that this sparseness vanishes. We thereby
lose the benefits a data base management system
can provide in terms of storing and manipulating
sparse matrices.

From a practical standpoint the revised simplex
method furnishes certain well-known computational
advantages [7]. The initial problem description
is not subject to modification; rather than
explicitly pivoting, we successively update the inverse
of each new basis. The current basis at any
step may be computed by taking the product of elementary
matrices, where each elementary matrix is
the identity except for one column. Thus a data
structure capable of supporting the revised simplex
method must be able to store both the original problem
matrix and the non-trivial columns of elementary
matrices which compose the product form of
the basis inverse.

The structure of Figure 4 suffices for the
original problem. Recall that no storage is allocated
for zero valued coefficients, so substantial
storage savings are realized as problems tend
toward sparseness. There is also a savings in processing;
if a coefficient is not stored then it cannot
be considered for processing. These savings
result regardless of whether there exists a special
pattern of sparseness (e.g. band matrices) or whether
there is no discernible pattern. That is, the
savings depend only upon the percentage of zero elements
and not upon their arrangement.
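
The storage argument can be sketched as follows for a small
hypothetical problem. The names SparseProblem, store, row and column
and all numeric values are inventions of this sketch, not GPLAN
facilities; plain Python dictionaries stand in for the sets HAS and
CONTAINS.

```python
from collections import defaultdict


class SparseProblem:
    """Coefficients stored once each, reachable by variable or expression."""

    def __init__(self):
        self.coefficients = {}             # key -> (expression, variable, value)
        self.has = defaultdict(list)       # variable -> keys (stand-in for HAS)
        self.contains = defaultdict(list)  # expression -> keys (stand-in for CONTAINS)

    def store(self, expression, variable, value):
        if value == 0:
            return None                    # zero coefficients occupy no storage
        key = len(self.coefficients)       # simple surrogate for a data base key
        self.coefficients[key] = (expression, variable, value)
        self.has[variable].append(key)
        self.contains[expression].append(key)
        return key

    def row(self, expression):
        """Non-zero coefficients of one expression, keyed by variable."""
        return {self.coefficients[k][1]: self.coefficients[k][2]
                for k in self.contains[expression]}

    def column(self, variable):
        """Non-zero coefficients of one variable, keyed by expression."""
        return {self.coefficients[k][0]: self.coefficients[k][2]
                for k in self.has[variable]}


# Hypothetical 3 x 4 matrix with five non-zero entries.
dense = {
    "E1": [4.0, 0.0, 0.0, 1.0],
    "E2": [0.0, 2.0, 0.0, 0.0],
    "E3": [0.0, 0.0, 3.0, 5.0],
}
problem = SparseProblem()
for expression, values in dense.items():
    for index, value in enumerate(values, start=1):
        problem.store(expression, f"X{index}", value)

stored = len(problem.coefficients)   # five occurrences instead of twelve cells
```

The count of stored occurrences depends only on how many entries are
non-zero, not on where they lie in the matrix.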

There are several ways to store columns of the
elementary matrices. In situations where it is
undesirable or not possible to store these in core,
a data structure similar to that of Figure 8 could
be used. As the structure indicates, an occurrence
of PROBLEM has many occurrences of EXPRESSION, of
VARIABLE, and of INVERSE associated with it. The
former two describe the initial matrix, which is
not subject to change. At the beginning there are
no occurrences of INVERSE; however one occurrence
is created and added to the set PRODUCT at each
iteration of the revised simplex algorithm. In
the DDL the data item type COLUMN-VALUE is defined
to be a vector. So a DML reference to an occurrence
of this item will involve an entire vector of
column values. For instance the command GFM
(4HCOLV, 4HPROD, DATA, IERR) will fill the array
DATA with the data vector of values from COLV for
the current member of the set PROD. The position
of a given column in its elementary matrix is
stored as the value of COLUMN-NO. Finally we must
be able to access the columns in the order in which
their respective elementary matrices were generated.
Since we can declare (in the DDL) an ordering for
the members of a set, we specify PRODUCT to be ordered
on a FIFO basis (First-In-First-Out). Thus
access to occurrences of INVERSE for a particular
occurrence of PROBLEM is based upon the order in
which those occurrences were originally added to
the PRODUCT set. It should be clear that the DML
commands used for implementation of the revised
simplex method in a data base context are the same
as those presented in the pivot example (except
for DRM, which is not needed for the revised simplex
method).
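
How the stored columns are used can be sketched as follows, assuming
each occurrence of INVERSE is represented by a pair (column_no,
column_values) and the FIFO-ordered set PRODUCT by a Python list.
The function names apply_eta and apply_product_form are hypothetical,
positions are counted from zero, and the numbers are made up for
illustration.

```python
def apply_eta(column_no, column_values, v):
    """Multiply vector v by an elementary matrix that is the identity
    except for column `column_no`, whose entries are `column_values`."""
    pivot_component = v[column_no]
    # Off-pivot rows: identity part plus the non-trivial column's share.
    out = [v[i] + column_values[i] * pivot_component for i in range(len(v))]
    # Pivot row: only the non-trivial column contributes.
    out[column_no] = column_values[column_no] * pivot_component
    return out


def apply_product_form(etas, v):
    """Apply the elementary matrices in the order they were generated."""
    for column_no, column_values in etas:    # FIFO order
        v = apply_eta(column_no, column_values, v)
    return v


# Hypothetical basis B = [[2, 0], [1, 1]]; its inverse is the single
# elementary matrix with column 0 equal to (1/2, -1/2).
etas = [(0, [0.5, -0.5])]
x = apply_product_form(etas, [4.0, 3.0])     # solves B x = (4, 3)
```

Only the non-trivial column and its position need to be stored for
each iteration, which is what COLUMN-VALUE and COLUMN-NO record.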

We realize of course that the revised simplex
and the product form are well-known, and that numerous
specialized data packing schemes have been
developed for treating sparse matrices. What is
interesting is the independent development of data
handling mechanisms which are, in a sense, general
packing schemes. The term 'general' refers to the
ability of a single data base management system
(e.g., GPLAN) to support data (numeric and non-numeric)
storage and analysis for a broad variety of
application areas. Past and present projects involving
the GPLAN data management system include the
areas of water quality control and planning, forestry,
health care, material requirements planning,
PERT management, and socio-economic research. Each
area requires data structures that embody that
area's relevant characteristics. Typically, each
area also requires that several types of analysis
be performable. In the water quality area for instance,
analyses of the following types have been
found to be useful: simulations, linear programming,
non-linear programming, and various statistical
analyses [8]. Programs to perform any of these
may be implemented with the DML.

Figure 8. Possible Data Structure for the Revised Simplex
Method with Product Form of the Basis Inverse.


We have indicated in the foregoing discussion
how data base management techniques may be used by
programmers in LP development, in such a way that
efficiencies of special data packing schemes are
realized. Similarly this same data manipulation
language may be used to accomplish any other
programmable analysis. Thus data base management
tools allow a single data base, containing both
numeric and non-numeric data, to support requests
of various users for a variety of analyses and reports.
An LP package is insufficient for this sort
of task, although it may be quite efficient in what
it does do. On the other hand, an installation
which possesses a data management system can devise
its own LP routines without regard to special packing
schemes and buffering procedures for interface
with auxiliary memory. These are handled automatically
by the data management system, and its general
packing scheme provides substantial savings in
the event of sparseness. Another important feature
of the data base management system is its capacity
to support a query system that allows non-programmers
to utilize a data base and analytic routines.


QUERY  LANGUAGE 

Whereas  GPLAN/DML  requires  that  a  user  write 
programs  in  a  host  language  with  the  utilization 
of  pertinent  DML  commands,  the  GPLAN  Query  System 
does  not  require  one  to  be  a  programmer  in  order  to 
utilize  the  data  base  for  purposes  of  display  or 
execution  of  large  application  routines.    The  user 
needs  merely  to  specify  what  is  to  be  done;  there 
is  no  statement  of  the  procedures  to  be  followed  in 
order  to  accomplish  the  task.    Examples  of  very  sim- 
ple commands  are 

LIST COEFFICIENT.SOURCE FOR VARIABLE.ID = 'X1' AND
CONSTRAINT.ID = 'R3'

LIST VARIABLE.ID AND CONSTRAINT.ID FOR
COEFFICIENT.SOURCE = 'TEST 1'

Upon  receipt  of  such  commands,  the  query  system 
analyzes  the  request,  sets  up  the  necessary  DML  com- 
mands, executes those commands and supplies the requested
data values. The system is designed such
that  it  permits  the  selective  (or  unconditional) 
retrieval  of  any  data  configuration.    Moreover,  it 
permits  execution  of  application  routines  using  any 
desired  (and  germane)  data  from  the  data  base.  The 
fundamental query syntax is

<COMMAND> <RETRIEVAL CLAUSE> <CONDITIONAL CLAUSE>

The  command  indicates  which  application  routine  is 
to  be  executed;  in  the  queries  above,  LIST  indicates 
that  a  report  generator  is  to  be  executed.    In  the 
retrieval  clause  the  user  specifies  what  data  are 
to be used for execution; this retrieval is dependent
on conditions specified in the conditional
clause.
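
The three-part syntax can be illustrated by a small sketch that
separates a query into its parts. It assumes, on the strength of the
two sample queries only, that the keyword FOR introduces the
conditional clause; the function split_query is hypothetical and says
nothing about how the GPLAN Query System analyzes a request.

```python
def split_query(query):
    """Split a query into (command, retrieval clause, conditional clause).

    Assumes the first word is the command and that the keyword FOR,
    when present, separates the retrieval clause from the condition.
    """
    command, _, rest = query.strip().partition(" ")
    retrieval, _, condition = rest.partition(" FOR ")
    return command, retrieval.strip(), condition.strip()


parts = split_query(
    "LIST COEFFICIENT.SOURCE FOR VARIABLE.ID = 'X1' AND CONSTRAINT.ID = 'R3'")
```

An unconditional request simply yields an empty conditional clause.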

A user of the query language is allowed to present
arbitrarily complex retrieval clauses. Not
only may this clause contain the names of data items
to be retrieved, but arithmetic operations (using
literals or data items) and both single and multivariate
functions may also be introduced. The conditional
clause is composed of a Boolean expression
which may contain data item names, literals, arithmetic
operators, relational operators, logical operators,
single-variable functions and multivariate
functions. The query language also permits the
use of noise words, synonyms and various other cosmetic
features for the convenience of the user.

A  conceptual  overview  of  the  standard  GPLAN 
system  is  portrayed  in  Figure  9.    The  library  of 
application  routines  is  composed  of  two  sections: 
standard  routines  and  special  routines.    The  stand- 
ard library  of  applications  consists  of  routines 
to  generate  reports  and  plots  and  to  perform  linear 
regressions,  statistical  analyses,  and  data  modifi- 
cation.   The  library  of  special  applications  may 
include  such  routines  as  special  report  generators, 
linear  and  non-linear  optimization  programs.  The 
issue of interfacing such optimization routines
with a network data base is discussed in [4]. One
method  of  interface  has  been  demonstrated  in  the 
pivot  example. 

DATA  HANDLING  FOR  MATHEMATICAL  PROGRAMMING 

Figure 9.  GPLAN System

In this section we examine the implications of
data base management techniques for those who implement
mathematical programming algorithms and for
those who make use of such implementations. In
so doing, we utilize the distinctive GPLAN features
of the network data base structure, the language
for non-programmer interface with a data base (DML)
and the query language that allows non-programmers
to effectively use a data base and pertinent application
routines. Not only does the GPLAN framework
allow for the obvious, i.e., the solution of
linear and non-linear optimization problems; it also
addresses the following considerations, which may
perhaps be more subtle, but are certainly of practical
significance [5].

1.  The  resolution  of  erroneous  formulations. 

2.  Treatment  of  coefficients  which  are  them- 
selves functions. 

3.  Situations  wherein  matrices  contain  com- 
mon data. 

4.  Storage of sparse matrices.

5.  Utilization  of  data  to  produce  timely, 
non-routine  reports  other  than  the  report 
furnished  by  a  general  mathematical  pro- 
gramming routine. 

6.  Ability  of  a  data  base  to  support  other 
varieties  of  application  routines  (e.g., 
simulations,  regressions,  etc.)  in  addi- 
tion to  mathematical  programming. 

The modus operandi for effective accommodation
of each of these attributes may be found in
[4]. Only a very brief treatment can be given here.
With respect to the first point, when erroneous coefficients
or improper formulation is suspected, the
ready availability of information with regard to coefficient
sources and currency is important. This
has obvious implications for the way in which data
are organized and retrieved. In this connection the
query system provides a convenient tool for data
editing.

Concerning the second attribute, it is not uncommon
that coefficients are the results of functions
that have been evaluated on the basis of some
other data. There may even be alternate functional
forms (or alternate data sets for evaluating a function),
which suggests the need for a facile method
of interfacing desired functions with the math programming
routines which depend upon them. Indeed,
a linear programming routine may be viewed as a
function that requires (recurring) evaluation in
order to execute a non-linear programming routine
[6]. One technique for handling this situation involves
an elimination of the distinction between
data and function, with respect to data base construction;
i.e., functions are treated as data and
included in the data base structure.
Figure 10.  Extended Logical Data Structure


Attribute number three is important for cases
where there is interest in several matrices which
are not entirely distinct with respect to constraints
and variables. For example, we may be investigating
a problem which has multiple plausible
formulations, some pairs of which share constraints
(and therefore variables). Care must be taken to
assure that updates to constraints in one matrix
are reflected in other matrices which share these
constraints; this is not a trivial matter where
large volumes of data are involved. A data structure
that can support this situation is given in
Figure 10. This structure allows us to store a
constraint only once, and at the same time indicate
that it is to be included in an arbitrary number
of matrices. This avoidance of redundancy averts
the potential for inconsistencies. A detailed
description of this structure at the occurrence
level may be found in [4]. Figure 11 displays verbatim
examples of the kinds of queries which could
be submitted, if the data structure is as shown in
Figure 10. The first three queries show how queries
may be phrased to locate and rectify errors in
the data. The next two queries demonstrate ways in
which LP problems may be formulated for execution.
The final query asks for basic statistics on
the coefficient values of variable 497 in matrix 7.

... CONSTRAINT.ID, COEFFICIENT.VALUE AND COEFFICIENT.SOURCE FOR VARIABLE.ID = 1347,
COEFFICIENT.VALUE < 1.0 OR COEFFICIENT.VALUE > 9.327, AND MATRIX.ID = 4

... COEFFICIENT.VALUE TO 8.27 WHEN VARIABLE.ID = 1347, CONSTRAINT.ID = 39,
... MATRIX.ID = 4

...NGE COEFFICIENT.VALUE TO LOG(COEFFICIENT.VALUE)*1.14 IF VARIABLE.ID = 58
... MATRIX.ID = 21

... LP FOR MATRIX.ID = 17 AND OBJ-FUN-ID = 5
... FOR PROBLEM.ID = 38
... COEFFICIENT.VALUE FOR VARIABLE.ID = 497 AND MATRIX.ID = 7

Figure 11.  Sample Queries
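The idea of storing a shared constraint only once can be sketched as follows. This is an illustration under assumptions, not the GPLAN data structure of Figure 10: the class names `Constraint` and `Matrix`, their fields, and the identifiers used are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Constraint:
    constraint_id: int
    # Maps variable id -> coefficient value.
    coefficients: dict = field(default_factory=dict)


@dataclass
class Matrix:
    matrix_id: int
    # Holds references to Constraint objects, never copies.
    constraints: list = field(default_factory=list)


# One stored occurrence of the constraint, included in two matrices.
shared = Constraint(39, {1347: 9.5})
m4 = Matrix(4, [shared])
m7 = Matrix(7, [shared])

# A single update to the stored occurrence ...
shared.coefficients[1347] = 8.27

# ... is seen through every matrix that includes the constraint.
values = [m.constraints[0].coefficients[1347] for m in (m4, m7)]
print(values)
```

Because both matrices refer to the same occurrence, no second copy exists that could drift out of step with the first.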

Many real-world applications entail the utilization
of sparse matrices. In the effort to avoid
storing zeros, many schemes have been devised for
packing (and unpacking) non-zero coefficients into
arrays. As already shown in the pivot example, the
GPLAN concept allows all matrices (sparse or otherwise)
to be accommodated by a single simple logical
structure, which realizes substantial storage savings
if a matrix happens to be sparse. Only non-zero
coefficients need to be stored; and storage
space is neither used nor even allocated for zero
coefficients.
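A minimal sketch of the storage principle, assuming a simple keyed store rather than the network data base itself: only non-zero coefficients occupy an entry, and a missing entry is read back as zero. The class `SparseMatrix` and its methods are hypothetical names.

```python
class SparseMatrix:
    """Stores only non-zero coefficients, keyed by (constraint, variable)."""

    def __init__(self):
        self._cells = {}

    def set(self, row, col, value):
        if value == 0:
            # A zero is represented by the absence of an entry.
            self._cells.pop((row, col), None)
        else:
            self._cells[(row, col)] = value

    def get(self, row, col):
        return self._cells.get((row, col), 0.0)

    def stored(self):
        """Number of coefficients that actually occupy storage."""
        return len(self._cells)


a = SparseMatrix()
a.set(39, 1347, 8.27)
a.set(39, 58, 0.0)      # not stored
a.set(12, 497, 3.5)
a.set(12, 497, 0)       # an overwritten zero frees the entry
print(a.stored(), a.get(39, 1347), a.get(12, 497))
```

The same logical structure serves dense matrices; they simply have an entry for every cell.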

Presumably managerial decisions are not based
solely upon the output of mathematical programming
routines. The fifth consideration indicates the
need for a facility to generate other reports
from a data base, and frequently these are non-standard
in terms of the types and configurations of
data that are retrieved. The GPLAN query system
allows the selective retrieval of any configuration
of data as a result of typing an English-like,
non-procedural query at a computer terminal. This
obviates the necessity of writing a program every
time a new type of report is needed. This concept
can be extended to include not only retrieval,
but also the execution of large application routines
that are not of the mathematical programming
variety. Furthermore, such executions can be accomplished
through the query language. The result is
a situation wherein a network data base can support
a broad spectrum of analyses for both programmers
and non-programming users.

CONCLUSION

The major intent has been to introduce fundamental
concepts from the realm of data base management
and to suggest their potential for contribution
to the solution of mathematical programming problems.
This contribution is considered with respect to both
the implementors and users of mathematical programming
algorithms. GPLAN, a generalized data base management
and query system, was described. It is
essential to observe that there is invariably a
tradeoff between specialized and general systems.
The former are generally more efficient due to their
limited flexibility; the latter are typically not
as efficient due to their great flexibility. Both
inefficiency and inflexibility have costs. GPLAN
provides a single mechanism which may be used to
treat the spectrum of special cases and to support
a variety of applications and users working with
the same data base. As an exercise, this general
system could be tailored to meet only special
needs, in circumstances where flexibility is unimportant
and efficiency is paramount.

Historically, the trend has been for each math
programming implementor to, in effect, devise data
management routines. We submit that advances (both
past and future) in the data base management field
can provide valuable concepts, techniques and tools
for the data handling aspects of mathematical programming.
Abstract. Recently, modifications of the method
of multipliers have been proposed which do not
require exact minimization in the intermediate
unconstrained problem. It can be shown that any
Lagrange multiplier updating scheme which yields
quadratic convergence with exact minimization
retains this rate if only two steps of a Newton
iteration are taken. It can also be shown that a
quasi-Newton iteration in conjunction with the
multiplier update proposed by Tapia yields a
superlinearly convergent algorithm. If the
penalty constant is allowed to increase to
infinity at a certain rate, simpler multiplier
update formulas will exhibit good convergence
rates.

Recently  there  has  been  much  research  in 
solution  of  constrained  optimization  problems  by 
the  method  of  multipliers.     This  method  was 
first  proposed  by  Hestenes    [1]   and  by  Powell  [2] 
independently  in  1969.     It  involves  solving  the 
problem 


$$\text{minimize } f(x) \quad \text{such that } g(x) = 0 \qquad (1)$$

where $f: R^n \to R^1$ and $g: R^n \to R^m$, by successively
minimizing the augmented Lagrangian

$$L(x,\lambda,c) = f(x) + \lambda^T g(x) + \tfrac{1}{2}\, c\, g(x)^T g(x)$$

in $x$. If $x^*$ is the solution of problem (1)
then there exists a unique $\lambda^* \in R^m$ such that
$\nabla_x L(x^*,\lambda^*,c) = 0$, where by $\nabla_x$ we mean the
gradient taken with respect to the variable $x$.
We will assume throughout that $m < n$, that
$\nabla g(x^*)$ is of full rank and that $\nabla_x^2 L(x^*,\lambda^*,c)$
is invertible. It can be shown, Buys [3], that
there exists a constant $\hat{c}$ such that for
$c > \hat{c}$, $x^*$ is an unconstrained local minimum
of $L(x,\lambda^*,c)$ and, for $\lambda$ in a neighborhood
of $\lambda^*$, $L(x,\lambda,c)$ has a local minimum in $x$.

The method of multipliers proceeds as
follows:

Step 1: Choose an initial approximation
$\lambda^0$ to $\lambda^*$, and an initial $c^0$.

Step 2: Choose $x^k$ to be the minimizer
of $L(x,\lambda^k,c^k)$.

Step 3: Let

$$c^{k+1} = P(x^k,\lambda^k,c^k)$$

$$\lambda^{k+1} = U(x^k,\lambda^k,c^{k+1})$$

Step 4: Return to step 2 with $k = k+1$.
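The steps above can be sketched on a small problem. The example is an assumption chosen for illustration and does not come from the paper: $f(x) = x_1^2 + x_2^2$, $g(x) = x_1 + x_2 - 1$, a constant penalty (so $P$ simply returns $c$), and the update $\lambda + c\,g(x)$. For this quadratic the exact minimizer in Step 2 is available in closed form.

```python
def g(x):
    """Equality constraint g(x) = x1 + x2 - 1 (illustrative problem)."""
    return x[0] + x[1] - 1.0


def minimize_augmented_lagrangian(lam, c):
    """Exact minimizer of L(x, lam, c) for f(x) = x1^2 + x2^2.

    Setting the gradient to zero gives x1 = x2 = t with
    2t + lam + c(2t - 1) = 0, i.e. t = (c - lam) / (2 + 2c).
    """
    t = (c - lam) / (2.0 + 2.0 * c)
    return (t, t)


def method_of_multipliers(lam=0.0, c=10.0, iterations=30):
    x = None
    for _ in range(iterations):
        x = minimize_augmented_lagrangian(lam, c)  # Step 2: exact minimization
        lam = lam + c * g(x)                       # Step 3: multiplier update, c held fixed
    return x, lam


x, lam = method_of_multipliers()
print(x, lam)  # approaches x* = (0.5, 0.5) and lambda* = -1
```

For this problem the multiplier error shrinks by the factor $1/(1+c)$ per pass, so a larger penalty constant gives a faster linear rate.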


The function $U$ may be referred to as a multiplier
update formula. The function $P$ is sometimes
chosen simply so that $c^k$ is large enough
that a minimizer of $L$ exists, while for other
schemes $c^k$ tends to infinity near the solution.

A  number  of  multiplier  update  formulas  have 
been  proposed.     We  consider  the  following: 


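$$U_{HP}(x,\lambda,c) = \lambda + c\,g(x) \qquad (2)$$

$$U_B(x,\lambda,c) = \lambda + [\nabla g(x)^T \nabla_x^2 L(x,\lambda,c)^{-1} \nabla g(x)]^{-1} g(x) \qquad (3)$$

$$U_P(x,\lambda,c) = -[\nabla g(x)^T \nabla g(x)]^{-1} \nabla g(x)^T \nabla f(x) \qquad (4)$$

$$U_T(x,\lambda,c) = [\nabla g(x)^T \nabla_x^2 L(x,\lambda,c)^{-1} \nabla g(x)]^{-1} [g(x) - \nabla g(x)^T \nabla_x^2 L(x,\lambda,c)^{-1} \nabla f(x)] - c\,g(x) \qquad (5)$$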


Formula (2) is the one originally proposed by
Hestenes and by Powell. Rupp [4] and Bertsekas
[5] have shown that for $c^k$ sufficiently large
the Hestenes-Powell update gives arbitrarily
good linear convergence. Formula (3) was proposed
by Buys [3], who showed that, if $x^k$ is considered
as a function of $\lambda$, (3) is equivalent to one
Newton iteration on the problem $g(x(\lambda)) = 0$,
and this gives quadratic convergence. Formula
(4) is just the projection of $\nabla f$ onto the
linear span of $\nabla g_1(x), \ldots, \nabla g_m(x)$. Its use has
been proposed by Haarhoff and Buys [7], Miele
et al. [6], and Fletcher [8]. Formula (5) was
originally proposed by Tapia [9] as a consequence
of a general theory for solving constrained
problems.
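As a small worked check of the projection formula (4), the sketch below evaluates it for a single constraint, where the matrix inverse reduces to a scalar division. The test problem ($f(x) = x_1^2 + x_2^2$, $g(x) = x_1 + x_2 - 1$) and the function name `projection_update` are assumptions made for illustration.

```python
def projection_update(grad_f, grad_g):
    """Formula (4) for one constraint: -(grad_g . grad_g)^-1 (grad_g . grad_f)."""
    gg = sum(a * a for a in grad_g)
    gf = sum(a * b for a, b in zip(grad_g, grad_f))
    return -gf / gg


# Illustrative problem: f(x) = x1^2 + x2^2, g(x) = x1 + x2 - 1,
# evaluated at its solution x* = (0.5, 0.5).
x_star = (0.5, 0.5)
grad_f = (2.0 * x_star[0], 2.0 * x_star[1])
grad_g = (1.0, 1.0)

lam = projection_update(grad_f, grad_g)
print(lam)
```

At the solution the formula returns the multiplier $\lambda^* = -1$ for this problem, and it does so without using any current multiplier estimate.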


Finding $x^k$ which minimizes $L(x,\lambda^k,c^k)$
exactly is of course impossible and even finding
a good approximation may be very costly. It is
natural to consider algorithms involving simple
approximations to the minimizer.

Bertsekas [5] considers an implementation
with the Hestenes-Powell update where the minimization
is carried out until $\|\nabla_x L\|$ is less than
some tolerance, and shows that for the tolerance
correctly chosen one may obtain convergence as
good as in the case of exact minimization, although
nothing may be said about how difficult that
tolerance is to achieve for large penalty
constants.

Another strategy is to take a small number of
iterations of a minimization algorithm, a strategy
referred to by Tapia as a diagonalized multiplier
method [9]. Such a strategy has also been proposed
by Haarhoff and Buys [7] and implemented
by Miele et al. [6].

We consider first the case of algorithms
implemented with bounded penalty constants, i.e.,
$c^k < \bar{c}$ for all $k$. If a multiplier update
formula such as that of Buys is used, one needs
second order information about the functions
involved, so it is reasonable to consider methods
in the unconstrained minimization phase which
generate and use the second derivative or an
approximation to it.


We will refer to an algorithm where the point
$x^{k+1}$ is determined by $j$ steps of some iterative
algorithm as a $j$-step diagonalization.

Based on different considerations Tapia
[9,10,11] proposed an algorithm which is equivalent
to a one-step diagonalization using Newton's
method. That is, the point $x^k$ is determined by

$$x^k = x^{k-1} - [\nabla_x^2 L(x^{k-1},\lambda^k,c^k)]^{-1}\, \nabla_x L(x^{k-1},\lambda^k,c^k)$$

and $\lambda^k$ is chosen according to Tapia's update (5):
$\lambda^k = U_T(x^{k-1},\lambda,c^k)$, where $\lambda = U_P(x^{k-1})$ is given
by the projection formula (4). Tapia
shows this method has the following convergence
properties [11].
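A $j$-step diagonalization can be sketched by replacing the exact minimization with $j$ Newton steps on the augmented Lagrangian. The sketch below is an illustration under stated assumptions, not Tapia's algorithm: it uses the simpler update $\lambda + c\,g(x)$ rather than Tapia's update, a fixed penalty constant, and a made-up test problem $f(x) = x_1 + x_2$, $g(x) = x_1^2 + x_2^2 - 2$, whose solution is $x^* = (-1,-1)$ with $\lambda^* = 1/2$.

```python
def g(x):
    """Constraint g(x) = x1^2 + x2^2 - 2 (illustrative problem)."""
    return x[0] ** 2 + x[1] ** 2 - 2.0


def grad_L(x, lam, c):
    """Gradient in x of L = f + lam*g + (c/2)*g^2 with f(x) = x1 + x2."""
    mu = lam + c * g(x)
    return (1.0 + 2.0 * mu * x[0], 1.0 + 2.0 * mu * x[1])


def hess_L(x, lam, c):
    """Hessian in x of the augmented Lagrangian: 2*mu*I + c * grad_g grad_g^T."""
    mu = lam + c * g(x)
    a, b = 2.0 * x[0], 2.0 * x[1]  # gradient of g
    return ((2.0 * mu + c * a * a, c * a * b),
            (c * a * b, 2.0 * mu + c * b * b))


def newton_step(x, lam, c):
    """One Newton step on L(., lam, c), solving the 2x2 system directly."""
    (h11, h12), (h21, h22) = hess_L(x, lam, c)
    r1, r2 = grad_L(x, lam, c)
    det = h11 * h22 - h12 * h21
    s1 = (h22 * r1 - h12 * r2) / det
    s2 = (-h21 * r1 + h11 * r2) / det
    return (x[0] - s1, x[1] - s2)


def diagonalized(x, lam, c=10.0, j=1, iterations=50):
    for _outer in range(iterations):
        for _inner in range(j):          # j Newton steps instead of exact minimization
            x = newton_step(x, lam, c)
        lam = lam + c * g(x)             # multiplier update with fixed penalty
    return x, lam


x, lam = diagonalized((-1.2, -0.8), 0.3)
print(x, lam)
```

The starting point is chosen near the solution, since the convergence results discussed here are local.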

Proposition 1. If the one-step diagonalized
method of multipliers is implemented with Tapia's
update and Newton's method, the sequence $\{x^k\}$
is uniformly quadratically convergent to the
solution $x^*$.

This result is obtained by noting that
Tapia's algorithm is essentially equivalent to
using Newton's method on the problem

$$\nabla_x L(x,\lambda,c) = 0, \quad g(x) = 0.$$

By uniform quadratic convergence of an
algorithm we mean there exist constants
$M, \nu > 0$ such that if $\|x^0 - x^*\| < \nu$, then

$$\|x^{k+1} - x^*\| \le M\, \|x^k - x^*\|^2$$

for all $k$.

The notion of uniform quadratic convergence
is slightly stronger than Q-quadratic convergence,
but almost all iterations which are Q-quadratically
convergent are uniformly quadratically
convergent. Using the projection formula for $\lambda$
allows us to get uniform quadratic convergence.
If we let $\lambda = \lambda^{k-1}$ we only get
R-quadratic convergence of $\{x^k\}$.

Since both the Buys multiplier update
formula and the Tapia formula give quadratic
convergence when $x^k$ is obtained by exact
minimization, it is natural to ask how the Buys
update behaves under diagonalization. If we
consider the question of quadratic convergence in
the space of both variables, $x$ and $\lambda$, we get the
following result.

Proposition 2. Consider the method of multipliers
implemented with a given multiplier update formula
and with bounded penalty constants. Uniform
quadratic convergence in $(x,\lambda)$ occurs with
exact minimization if and only if it occurs with
a two-step diagonalization using Newton's method.

Proof. See [12].

Since both the Tapia update and the Buys
update give quadratic convergence for exact
minimization, it follows that one may get
the same convergence for a two-step diagonalization
with Newton's method, and for such updates
it seems nothing is contributed to the convergence
rate by further iterations.

Since using Newton's method requires
knowledge of the second derivatives of the
functions involved and requires a large number
of function evaluations at each step, it would be
advantageous if a quasi-Newton method could be
used in place of Newton's method in the determination
of $x^k$. If this were done with one of
the quadratically convergent updates such as that
of Tapia or Buys, it would be necessary to use an
approximation to $\nabla_x^2 L(x,\lambda,c)$ in the multiplier
update formula. Consider then the following
iteration:

$$\lambda^{k+1} = [\nabla g(x^k)^T B_k^{-1} \nabla g(x^k)]^{-1} [g(x^k) - \nabla g(x^k)^T B_k^{-1} \nabla f(x^k)] - c^k g(x^k)$$

$$x^{k+1} = x^k - B_k^{-1} \nabla_x L(x^k,\lambda^{k+1},c^k)$$

$$B_{k+1} = V\big(x^{k+1} - x^k,\ \nabla_x L(x^{k+1},\lambda^{k+1},c^k) - \nabla_x L(x^k,\lambda^{k+1},c^k),\ B_k\big)$$

$$c^{k+1} = P(x^{k+1},\lambda^{k+1},c^k) \qquad (6)$$

The function $V(s,y,B)$ is the Hessian update
function for some quasi-Newton method. The
analysis of such an algorithm is somewhat
different for each possible Hessian update
function, but Tapia has proved the following
result for standard quasi-Newton methods [11].

Proposition 3. Consider the method of multipliers
implemented as in (6) with bounded penalty
constants and with the function $V$ being any one
of the Broyden, Davidon-Fletcher-Powell,
Broyden-Fletcher-Goldfarb-Shanno or Powell Symmetric
Broyden Hessian update functions. The sequence
of points $\{(x^k,\lambda^k)\}$ will then be locally and
superlinearly convergent to $(x^*,\lambda^*)$.

This result shows that the algorithm
outlined in (6) is a relatively efficient
strategy. It is possible that there would be
some advantage in taking more than one quasi-Newton
step in determining $x^k$; however, by this
result the convergence rate would still be
superlinear.
It seems likely that if a quasi-Newton
method were used with the Buys update in some
diagonalized scheme, superlinear convergence
would result. However, it has not been possible
to prove this yet. It is interesting to note
that according to the above result, the matrix
$B_k$ works well as an approximation to the Hessian
in the multiplier update formula even though it
is known that $B_k$ does not converge to the
Hessian in general.

Up to this point we have been concerned
with algorithms where the penalty constant is
bounded throughout the iteration. However,
it is well known that in some cases the rate of
convergence is improved as the penalty constant
goes to infinity. As mentioned previously,
Bertsekas [5] shows arbitrarily fast linear
convergence for exact and approximate minimization
in these cases. Miele et al. [6] have done
numerical experiments involving the use of the
projection update (equation 4) in various diagonalized
schemes using Newton's method and
quasi-Newton methods. By increasing the penalty
constant at the appropriate rate they get convergence
which appears R-superlinear.

In addition to the projection update we
consider multiplier update formulas $U(x,\lambda,c)$ with
the property

$$\nabla_\lambda U(x^*,\lambda^*,c) = 0.$$

We will refer to such formulas as local multiplier
approximation formulas. Both the projection
update and the Tapia update fall into this class.
Note that it is essential that such a formula
have the property $U(x^*,\lambda^*,c) = \lambda^*$.

In the case of exact minimization, as the
penalty constant $c$ goes to infinity the
various updates mentioned tend to merge together
and all give good convergence. In the
case of local approximation formulas, the author
shows in [13] that if $c^k$ is chosen to be
$\gamma\, \|g(x^{k-1})\|^{-\alpha}$, where $\gamma > 0$ and $0 < \alpha \le 1$, the
iteration is locally convergent with Q-order
$1 + \alpha$.

However, if such a scheme for choosing
penalty constants is used in a one-step
diagonalization with Newton's method, convergence
becomes slower when $\alpha$ is close to
one. If $\alpha = 1$ the method is very slow and
may not converge at all. The problem is that
when the penalty constant increases too fast
the Newton iteration becomes unstable. This slow
convergence occurs even if the normally quadratically
convergent Tapia update is used.

These difficulties may be avoided, in a
manner leading to efficient algorithms, in two
ways. If a two-step diagonalization is used, one
gets again the same convergence rate as with
exact minimization. At best, with $\alpha = 1$, the
method is quadratically convergent. Alternatively,
a one-step diagonalization may be used with
$\alpha = \tfrac{1}{2}$; in this case the order of convergence is $1\tfrac{1}{2}$.
This can be shown to be the optimum value of $\alpha$
for a one-step diagonalization [13].
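The penalty schedule discussed above can be written down in a few lines. This is only a sketch of the formula $c = \gamma\,\|g\|^{-\alpha}$ and of the stated Q-order $1 + \alpha$; the function names and default values are assumptions, and the caller must supply a non-zero constraint norm.

```python
def penalty_constant(g_norm, gamma=1.0, alpha=0.5):
    """Penalty constant gamma * ||g||^(-alpha), with 0 < alpha <= 1."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    return gamma * g_norm ** (-alpha)


def q_order(alpha):
    """Q-order of local convergence stated for this schedule: 1 + alpha."""
    return 1.0 + alpha


# As the constraint violation shrinks, the penalty constant grows.
schedule = [penalty_constant(norm) for norm in (1e-2, 1e-4, 1e-6)]
print(schedule, q_order(0.5))
```

With $\alpha = \tfrac{1}{2}$ the penalty grows like the inverse square root of the constraint violation, which is the value singled out for a one-step diagonalization.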


The above development seems to indicate that
a one or two step diagonalization is as effective
as exact minimization in ensuring fast convergence.
Alternatively, if one is using a
scheme demanding a certain degree of exactness
in the minimizer, these results give some
indication of how much effort is required to
achieve this exactness.

BIBLIOGRAPHY 

1.  Hestenes, M.R., "Multiplier and gradient
methods," Journal of Optimization Theory and
Applications, Vol. 4, pp. 303-320, 1969.

2.  Powell, M.J.D., "A method for nonlinear
constraints in minimization problems," in
Optimization, Edited by R. Fletcher, Academic
Press, London, England, 1969.

3.  Buys, J.D., "Dual algorithms for constrained
optimization," Ph.D. thesis, Rijksuniversiteit
te Leiden, the Netherlands, 1972.

4.  Rupp, R.D., "On the combination of the multiplier
method of Hestenes and Powell with
Newton's method," Journal of Optimization
Theory and Applications, Vol. 15, pp. 167-187,
1975.

5.  Bertsekas, D.P., "Combined primal dual and
penalty methods for constrained minimization,"
SIAM Journal on Control, Vol. 13,
pp. 521-543, 1975.

6.  Miele, A., Moseley, P.E., Levy, A.V., and
Coggins, G.M., "On the method of multipliers
for mathematical programming
problems," Journal of Optimization Theory
and Applications, Vol. 10, pp. 1-33, 1972.

7.  Haarhoff, P.D., and Buys, J.D., "A new method
for the optimization of a nonlinear function
subject to nonlinear constraints," Computing
Journal, Vol. 13, pp. 178-184, 1970.

8.  Fletcher, R., "An ideal penalty function
for constrained optimization," Journal of
the Institute of Mathematics and its
Applications, Vol. 15, pp. 319-342, 1975.

9.  Tapia, R.A., "Newton's method for optimization
problems with equality constraints,"
SIAM Journal on Numerical Analysis, Vol. 11,
pp. 874-886, 1974.

10. Tapia, R.A., "Newton's method for problems
with equality constraints," SIAM Journal on
Numerical Analysis, Vol. 11, pp. 174-196,
1974.

11. Tapia, R.A., "Diagonalized Multiplier methods
and Quasi-Newton methods for constrained
optimization," Department of Mathematical
Sciences Technical Report, Rice University,
Houston, Texas, June 1975, to appear in
Journal of Optimization Theory and Applications.

12. Byrd, R.H., "Local Convergence of the Diagonalized
Method of Multipliers." Submitted
to Journal of Optimization Theory and
Applications.

13. Byrd, R.H., "Local Convergence of the Diagonalized
Method of Multipliers," Ph.D.
thesis, Rice University, Houston, Texas,
1976.




DIRECT  APPROACHES  FOR  THE  MINIMAX  PROBLEM 


A.R.  Conn 

Department  of  Combinatorics  and  Optimization 
University  of  Waterloo 


Introduction . 

Consider a system of $m$ real, twice continuously
differentiable, and in general, nonlinear,
functions

$$f_i(x), \quad i \in [M],$$

where $[M] = \{1, 2, 3, \ldots, m\}$ is an index set. Let

$$M_f(x) = \max_{i \in [M]} f_i(x), \qquad x = [x_1, x_2, \ldots, x_n]^T. \qquad (1)$$

The problem under consideration is to locate a
point $x_0$ such that

$$M_f(x_0) \le M_f(x) \qquad : P$$

for all points $x$, at least in a neighbourhood of $x_0$.

The objective function $M_f(x)$ has discontinuous
first partial derivatives at points where
two or more of the functions $f_i(x)$ are equal to
the maximum. This is true even if the $f_i(x)$
have continuous first partial derivatives.

As  an  illustration,  figure  1  shows  the 
contours  for    M^(x)  for  a  purely  linear  case,  viz. 


Example 1.


f^  =  -  lOx^  -  2.5x2 


'2  -  - 


20x„ 


lOx^  +  20x2 


AO. 


The  points  of  discontinuity  are  indicated  in 
figure  1  by  dotted  lines. 

We first note that because of the discontinuities
in the derivative the well-known gradient
type methods cannot be used directly to minimize
$M_f(x)$.

Secondly, we note that an equivalent formulation
of the problem P is (see for example [17])

$$\text{Minimize } z \quad \text{such that } z - f_i(x) \ge 0, \quad i \in [M], \qquad : P^*$$

where $z$ is a new independent variable.

But in general, P* is just a "standard" non-linear
programming problem in $n+1$ variables, and
can be solved accordingly. In fact, this is
exactly the approach taken by [16].

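Figure 1.  Contours of $M_f(x)$ for Example 1.

Thirdly, we note (as is easy to show, on
applying the well-known Kuhn-Tucker conditions to
P*) that a minimax optimum at $x_0$ can be characterized
(see, for example [17]) by

$$\sum_{i \in E(x_0,0)} u_i \nabla f_i(x_0) = 0, \qquad \nabla f_i(x_0) = \left[ \frac{\partial f_i(x_0)}{\partial x_1}, \ldots, \frac{\partial f_i(x_0)}{\partial x_n} \right]^T, \qquad (2a)$$

and

$$\sum_{i \in E(x_0,0)} u_i = 1, \qquad (2b)$$

$$u_i \ge 0, \quad i \in E(x_0,0), \qquad (2c)$$

where

$$E(x,\epsilon) = \{\, i \in [M] \mid M_f(x) - f_i(x) \le \epsilon \,\} \qquad (2d)$$

(i.e., $E(x,\epsilon)$ is the index set of those functions
that are within $\epsilon$ of $M_f(x)$ at $x$).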
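The objective $M_f(x)$ and the index set $E(x,\epsilon)$ can be evaluated directly, as the following sketch shows. The three functions are chosen only for illustration, and the names `minimax_objective` and `active_set` are hypothetical.

```python
import math


# Illustrative functions of x = (x1, x2); an assumption made for this sketch.
def f1(x):
    return x[0] ** 2 + x[1] ** 4


def f2(x):
    return (2.0 - x[0]) ** 2 + (2.0 - x[1]) ** 2


def f3(x):
    return 2.0 * math.exp(-x[0] + x[1])


FUNCTIONS = (f1, f2, f3)


def minimax_objective(x):
    """M_f(x): the largest of the function values at x."""
    return max(f(x) for f in FUNCTIONS)


def active_set(x, eps=0.0):
    """E(x, eps): 1-based indices i with M_f(x) - f_i(x) <= eps."""
    m = minimax_objective(x)
    return [i for i, f in enumerate(FUNCTIONS, start=1) if m - f(x) <= eps]


print(minimax_objective((0.0, 0.0)), active_set((0.0, 0.0)))
print(minimax_objective((1.0, 1.0)), active_set((1.0, 1.0)))
```

At $(0,0)$ only the second function attains the maximum, while at $(1,1)$ all three functions take the value 2, so $M_f$ has a kink there.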

We  shall  now  formulate  informally  an  approach 
to  the  minimax  problem  based  on  three  properties 
cited  above. 


Suppose we are at $x^k$. Furthermore, suppose
that $f_i(x^k)$, $i = 1, \ldots, q$, take on the maximum
function value $M_f(x^k)$. As in most optimization
approaches our aim is to determine a direction, $d^k$,
say, such that $M_f$, at $x^k$, decreases in this
direction. It then remains to determine how far
to step in the chosen direction and the iteration
is defined. It is clear that since the $f_i$,
$i \ne 1, \ldots, q$, are less than $M_f(x^k)$, locally at
least, we are only concerned with those functions
$f_i$, $i = 1, \ldots, q$. We shall term these functions
the active functions.

Let us suppose we find a direction $d$ such
that $f_1$, say, decreases. If it should happen that
in addition, $f_2, \ldots, f_q$ decrease, then we have a
descent direction for $M_f$. However, suppose that
although $f_1$ decreases, $f_2$ increases. Then, as
is obvious, $M_f$ will in fact increase. Thus what
we must do is decrease all the active functions
simultaneously.

If we now recall the non-linear programming
formulation of the problem, we observe that active
functions correspond to active constraints and
that $f_2$ increasing, for example, while $z$ is
decreasing, amounts to violating a constraint
(assuming $f_2$ was an active function at the start
of the iteration and $z = M_f(x^k)$). Thus a
sufficient condition that we have a suitable
descent direction for $M_f$ is equivalent to
ensuring that no active constraints are violated.
[That this is not a necessary condition we will
see shortly]. This situation is a familiar
occurrence in the context of mathematical
programming (see, for example [18] and [15]) and
one approach to resolving the difficulty is via
projections. [See [8] for further references and
details].

An elementary exposition on projection
matrices now follows. Suppose we have the
following basis for $E^n$:

$$a_1, \ldots, a_q, b_{q+1}, \ldots, b_n$$

where

$$b_i^T a_j = 0 \quad \forall\, i, j.$$

Consider the following $n \times n$ matrix

$$P = I - N (N^T N)^{-1} N^T, \qquad (3)$$

where $N = [a_1 \ldots a_q]$ (i.e., $N$ is an $n \times q$ matrix
of rank $q$) and $I$ is the $n \times n$ identity matrix.
It is easily seen that

$$P a_j = 0, \quad P b_i = b_i \quad \forall\, i, j.$$

Furthermore, $P^2 = P$ and we note that $P$ takes
any vector in $E^n$ into $Q$, where $Q$ denotes the
space spanned by $b_{q+1}, \ldots, b_n$.

Now suppose we take $P$ with

$$a_i = \nabla \phi_i(z,x), \quad i = 1, \ldots, q,$$

where

$$\phi_i(z,x) = z - f_i(x)$$

and … is as defined in equation (15) below
[i.e., $\phi_i \ge 0$ are the inequality constraints of
P*]. We note that $a_i$, and thus $P$, is independent
of $z$. Furthermore, let us define $d \in R^{n+1}$ by

$$d = -Pe, \qquad (5)$$

where

$$e = [1, 0, \ldots, 0]^T, \quad e \in R^{n+1}. \qquad (6)$$

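Now, in the case where the $f_i$ are linear, we
have, by Taylor's theorem

$$\phi_i[z + \lambda d^1, x + \lambda d^2] = \phi_i(z,x) + \lambda d^1 - \lambda [d^2]^T \nabla f_i(x), \qquad (7)$$

where $d = [d^1 : d^2]$, $d^1, \lambda \in R^1$, $\lambda > 0$.
But, by construction

$$d^1 - [d^2]^T \nabla f_i(x) = d^T a_i = 0, \quad i = 1, \ldots, q.$$

Thus

$$\phi_i[z + \lambda d^1, x + \lambda d^2] = \phi_i[z,x], \quad i = 1, \ldots, q. \qquad (8)$$

More generally, for concave $f_i$, $i = 1, \ldots, q$, we have that

$$\phi_i[z + \lambda d^1, x + \lambda d^2] \ge \phi_i(z,x), \quad i = 1, \ldots, q. \qquad (9)$$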

That is, in the case where the active functions are
linear or concave, if the search direction is given
by (5), no active constraints in the equivalent non-linear
programming formulation are violated.
Furthermore, since $e^T d = -\|Pe\|^2 < 0$ if $Pe \ne 0$
[which we may assume, for the present], $d$ is a
descent direction for $z$. Thus our objective is
realized, i.e., we are reducing $M_f$.
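The direction $d = -Pe$ can be computed without forming $P$ explicitly. The sketch below orthonormalizes the columns $a_i = [1, -\nabla f_i(x)]$ by Gram-Schmidt and removes their components from $e$, which gives the same vector as applying $P = I - N(N^T N)^{-1} N^T$ when the columns are linearly independent. The two linear active functions, $f_1 = x_1 + x_2$ and $f_2 = x_1 - x_2$, and all function names are assumptions made for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(columns):
    """Gram-Schmidt on linearly independent columns (rank q is assumed)."""
    basis = []
    for a in columns:
        w = list(a)
        for u in basis:
            coef = dot(w, u)
            w = [wi - coef * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis


def descent_direction(gradients):
    """d = -P e, where P projects onto the complement of span{a_1, ..., a_q}."""
    # a_i is the gradient of phi_i(z, x) = z - f_i(x) in the variables (z, x).
    columns = [[1.0] + [-gi for gi in grad] for grad in gradients]
    e = [1.0] + [0.0] * len(gradients[0])
    pe = list(e)
    for u in orthonormalize(columns):
        coef = dot(e, u)
        pe = [p - coef * ui for p, ui in zip(pe, u)]
    return [-p for p in pe]


# Gradients of the two illustrative active functions.
active_gradients = [(1.0, 1.0), (1.0, -1.0)]
d = descent_direction(active_gradients)
print(d)  # approximately [-0.5, -0.5, 0.0]
```

For this example $z$ decreases along $d$ and both active functions decrease at the same rate as $z$, so neither constraint $z - f_i(x) \ge 0$ is violated, in line with equation (8).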

Referring  to  figure  1,   let  us  suppose  we  are 

at  the  point    x*^    where     f^  (x)  =  f.,(x)  =  M^(x).  The 


corresponding descent direction for $M_f$, $d^-$, is indicated on the figure. We see that it is an
ideal  choice.     In  particular,  we  see  that,  locally 
at  least,   the  indices  of  the  active  functions  do 
not  change.     This  result,  of  course,  is  an 
immediate  consequence  of  equation  (8). 

In practice, not all the $f_i$'s, $i = 1, \ldots, q$, need to be put into the projection and, in
particular, it is possible to ensure that $Pe \ne 0$
and  thus  find  a  descent  direction  that  does  not 
violate  those  constraints  not  in  the  projection, 
unless  optimality  has  already  been  achieved.  That 
this  is  so  is  a  consequence  of  the  equations  (2a), 
(2b),   (2c)  and  (2d).     [A  point  corresponding  to 
such  a  situation  is  indicated  by    y    on  figure  2 
below.     That     Pe  =  0     in  this  case  is  easy  to  see 
since     f^(y)  =  f^Cy)  =  f^(y).     We  can,  in  fact. 


fjCy) 


remove    Vcfi^     from    N     in  this  case  to  obtain 
descent] . 

We have noted above that ensuring no active constraints are violated in the corresponding non-linear programming problem is a sufficient condition for a suitable descent direction for $M_f$. As we will now indicate, it is not a necessary condition. Furthermore, in such a case, our solution to the problem of finding a descent direction to $M_f$ via the projection matrix $P$ is still suitable.

Suppose our active functions are $f_1$ and $f_2$,
Example 2.

$$f_1(x) = x_1^2 + x_2^4, \quad f_2(x) = (2 - x_1)^2 + (2 - x_2)^2, \quad f_3(x) = 2\exp(-x_1 + x_2).$$

Figure 2. Contours of $M_f(x)$ for Example 2.


both convex functions. That is, $q = 2$, $M_f(x) = f_1(x) = f_2(x)$.

Choose $d$ as in (5). The corresponding result to (9) is

$$\phi_i[z + \lambda d^+, x + \lambda d^-] \le \phi_i(z,x), \quad i = 1, 2 \tag{10}$$

[since $\phi_i$ is now concave].

However, by construction,

$$e^T d < 0, \quad \text{i.e.,} \quad d^+ < 0,$$

and

$$d^T \nabla\phi_i(z,x) = 0, \quad i = 1, 2,$$

i.e.,

$$d^+ - [d^-]^T \nabla f_1(x) = 0 \quad \text{and} \quad d^+ - [d^-]^T \nabla f_2(x) = 0,$$

i.e.,

$$[d^-]^T \nabla f_1(x) = [d^-]^T \nabla f_2(x) = d^+ < 0, \tag{11}$$

or, in other words, locally at least, both $f_1$ and $f_2$ have decreased, i.e., $d^-$ is a descent direction for $M_f$.

Remark.     This  in  itself  indicates  that  the  approach 
we  take  here  is  more  general  than  just  solving  the 
problem  P*. 

Furthermore,  we  remark  at  this  stage  that, 
since  a  major  part  of  our  algorithm  depends  upon 
determining    P,  we  can  make  use  of  the  numerical 
results  available  for  determining  orthogonal 
projections . 


Having determined the search direction we note that minima along a line for a function such as $M_f(x)$ are likely to occur at those points at which the first derivative is discontinuous [for example, in the linear case this is guaranteed]. Thus, as an initial estimate, the linear search approximates a subset of these points of discontinuity. [Specifically, that subset which corresponds to the points of discontinuity given by $f_j(x) = f_k(x)$ (where $j$ is one of the active functions at $x^k$) over all $k$ corresponding to inactive functions at $x^k$]. If the approximation so determined gives a point where the actual value of $M_f$ is sufficiently lower, then it is accepted and the line search is completed; otherwise a more usual (and more sophisticated) line search is required.

2. The algorithms of Bartels, Charalambous and Conn [3], [5].

The basic ideas behind these two algorithms were those given in section 1 above. As originally presented, the first reference considered only linear functions [actually, piecewise linear since $M_f(x) = \max_i a_i x$] and the second reference concerned itself with only non-linear functions $f_i$, i.e., no attempt was made to improve the efficiency by making use of any linearities present amongst the $f_i$.

We will attempt to combine the results of both algorithms in one algorithm for the problem P. In addition, differences from the originals will be indicated, and the results will be related to the work of others in the field, in a later section.

Thus we will use the formulation P* to determine the search direction, but will then apply a related direction directly to the function $M_f(x)$.

In the algorithm, each iteration consists of one, or sometimes two, directions. The first, the one that is always present, is termed the horizontal direction. It tries to keep the index set of near active functions constant. Two or more functions are considered near active if they are equal to the present maximum up to a specified tolerance. In addition, it is required that the horizontal direction be such as to decrease $M_f$. This corresponds to the direction $d$
in section 1. The second, the vertical direction,
amounts  to  attempting  to  satisfy  the  near  active 
functions  exactly  by  means  of  linearization.  This 
is  done  by  determining  the  least  squares  solution 
of,  in  general,  an  underdetermined  system  of 
linear  equations.     As  such  this  step  is  closely 
related,  algebraically,   to  that  of  determining  P 
and  once  again  is  a  well-worn  path  in  the  field  of 
numerical  analysis.     The  necessity  for  the  vertical 
direction  is  given  in  [5]  and  details  for  the 
computation  of  both  directions  are  set  out  below. 
We  note  that,  in  the  limit  these  two  directions  are 
orthogonal,  hence  the  terminology. 
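As an illustration of the least squares computation mentioned above, the sketch below assumes (this form is not stated explicitly at this point in the text) that the linearized system is $A^T v = -\Phi$, where the columns of $A$ are the gradients $\nabla\phi_i$ of the near active functions and $\Phi$ holds their current values. With the thin factorization $A = Q_1 R$, the minimum-norm solution is $v = -Q_1 R^{-T}\Phi$. The data and the name `vertical_step` are hypothetical.

```python
import numpy as np

def vertical_step(A, phi):
    """Minimum-norm v with A^T v = -phi, using the thin QR factors A = Q1 R."""
    Q1, R = np.linalg.qr(A)            # reduced mode: Q1 is n x k, R is k x k
    y = np.linalg.solve(R.T, -phi)     # solve R^T y = -phi
    return Q1 @ y                      # v lies in the span of the columns of A

# Hypothetical gradients (columns) and near active values phi_i.
A = np.array([[1.0, 1.0],
              [-1.0, 0.0],
              [-0.5, -1.0],
              [2.0, -1.0]])
phi = np.array([0.0, 0.03])

v = vertical_step(A, phi)
```

Because $v$ lies in the span of the columns of $A$ while the horizontal direction lies in the null space of $A^T$, the two directions are orthogonal whenever they are built from the same $A$.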

A  linear  search  follows  that  incorporates 
several  simple  features  of  the  algorithm  and 
numerical  results  to  date  indicate  that  the 
resulting  algorithm  is  very  efficient. 
2.1     Notation and Orthogonal Decompositions.

When we want to find the direction of search at a point $x^k$ we set $z = z_k = M_f(x^k)$. By doing so, a near active function in the minimax problem corresponds to a near active constraint in the non-linear programming problem, viz.

$$E(x^k,\epsilon) = \{i \in [M] : |M_f(x^k) - f_i(x^k)| \le \epsilon\} = \{i \in [M] : |\phi_i(z_k,x^k)| \le \epsilon\}. \tag{12}$$

We note that $E(x^k,\epsilon)$ can be considered dependent only on $x^k$ since $z_k = M_f(x^k)$. Let

$$E(x^k,\epsilon) = NE(x^k,\epsilon) \cup LE(x^k,\epsilon), \tag{13}$$

where $LE(x^k,\epsilon)$ is the index set for the near-active linear constraints, and let

$$I(x^k,\epsilon) = [M] \setminus E(x^k,\epsilon), \tag{14}$$

$$\nabla\phi_i(z,x) = \left[\frac{\partial\phi_i}{\partial z}(z,x), \frac{\partial\phi_i}{\partial x_1}(z,x), \ldots, \frac{\partial\phi_i}{\partial x_n}(z,x)\right]^T \tag{15}$$

$$= [1, -\nabla f_i(x)^T]^T. \tag{16}$$

We note that $\nabla\phi_i(z,x)$ is independent of $z$.

$$e = [1, 0, \ldots, 0]^T \ (= \nabla z). \tag{17}$$

If the $n$ by $k$ matrix $A$ has independent columns it can be factored into the form

$$A = Q \begin{bmatrix} R \\ 0 \end{bmatrix} = [Q_1 : Q_2] \begin{bmatrix} R \\ 0 \end{bmatrix}, \tag{18}$$

where $Q = [Q_1 : Q_2]$ is an $n$ by $n$ orthogonal matrix and $R$ is $k$ by $k$, right triangular, and non-singular. $Q_1$ denotes the first $k$ columns of $Q$.

Let $\mathcal{N}(A^T)$ denote the null space of $A^T$, i.e.,

$$\mathcal{N}(A^T) = \{u \mid A^T u = 0\}. \tag{19}$$

The projector on $\mathcal{N}(A^T)$ is given by the matrix

$$Q_2 Q_2^T. \tag{20}$$

Further, suppose $h$ lies in the span of $A$, i.e.,

$$h = Aw = \sum_{i=1}^{k} w_i a_i, \tag{21}$$

where $A = [a_1 a_2 \ldots a_k]$; then $w$ is given by

$$w = R^{-1} Q_1^T h. \tag{22}$$

Write $w = [w_1 \ldots w_k]^T$. In the context of this paper, and in the light of the introduction, we are considering the columns of $A$ to be a subset of the $\nabla\phi_i(z,x)$'s where $i \in E(x,\epsilon)$.

Thus the direction of (5) is given by

$$d = -Q_2 Q_2^T e. \tag{23}$$
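The relations (18) to (23) can be illustrated with a library QR factorization. This is a sketch only, assuming NumPy; it uses `numpy.linalg.qr` with `mode="complete"` (which computes the factorization by Householder reflections, not the Givens-based updating described later), and the matrix `A` and vector `w_true` are hypothetical.

```python
import numpy as np

# Hypothetical A with k = 2 independent columns in R^4.
A = np.array([[1.0, 1.0],
              [-1.0, 0.0],
              [-0.5, -1.0],
              [2.0, -1.0]])
n, k = A.shape

Q, R_full = np.linalg.qr(A, mode="complete")   # Q is n x n, R_full is n x k
Q1, Q2 = Q[:, :k], Q[:, k:]
R = R_full[:k, :]                              # k x k, right (upper) triangular

P = Q2 @ Q2.T                                  # projector on the null space of A^T, (20)

e = np.zeros(n)
e[0] = 1.0
d = -P @ e                                     # direction of (23)

w_true = np.array([0.25, -1.5])
h = A @ w_true                                 # h lies in the span of A, (21)
w = np.linalg.solve(R, Q1.T @ h)               # w = R^{-1} Q1^T h, (22)
```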


As we shall shortly see, we will actually build up $Q$ by adding columns to $A$ one at a time.

Furthermore, in order to eliminate unnecessary re-evaluation at subsequent iterations, those columns that correspond to linear functions $f_i(x)$ will be added in first.

At this point let us accept the following convention: if, in a particular context, there will be no ambiguity, we may denote functional expressions dependent upon $x$ and sometimes $\epsilon$ more simply by abandoning one, or often both, of their arguments. Thus $LE(x,\epsilon)$, $NE(x,\epsilon)$, $q(x,\epsilon)$, $E(x,\epsilon)$, etc., will sometimes be denoted by $LE(x)$, $NE(x)$, $q(x)$, $E(x)$, etc., or sometimes as simply as $LE$, $NE$, $q$ or $E$, etc.


2.2  The Algorithm

Step 0: Set Label = 0, $k = 0$, vs = 0, $x = x^0$, the starting point, and a value of $\epsilon$, and epstop. Set $\tau_{max}$. Note: $\tau_{max}$ is used in the linear search algorithm. It denotes an upper bound on the admissible stepsize. (Note that Label is used to indicate whether a vertical direction should be taken (Label = 6) or not (otherwise). Similarly vs indicates whether the vertical step was successful (vs = 1) or not (vs = 0)).

Step 1: Set $z_k = M_f(x^k)$. At the point $x^k$ determine the active functions within the specified tolerance $\epsilon$. In other words, determine $LE(x^k,\epsilon)$, $NE(x^k,\epsilon)$ and hence $E(x^k,\epsilon)$. If $k \ne 0$, go to Step 3.

Step 2: Determine the projection matrix initially. (This step involves inner iterations).

Step 2a: Set $j = 0$, $Q = I$ [the $(n+1) \times (n+1)$ unit matrix], $A = \emptyset$, $J_0 = K_0 = \emptyset$ (the empty set).

$J_0$ and $K_0$ are index sets for those linear and non-linear functions, respectively, that are in the projection.

Step 2b: If $LE(x^k,\epsilon) = \emptyset$ go to step 2d. Otherwise let

$$q = Q_2 Q_2^T e$$

[Recall that $Q_2$ is the matrix made up of the last $(n+1-k)$ columns of $Q$] and

$$j = \left\{ \text{the } i \text{ that maximizes } \frac{q^T \nabla\phi_i(z,x^k)}{\|q\| \, \|\nabla\phi_i(z,x^k)\|} \;\Big|\; i \in LE(x^k,\epsilon) - J_0 \right\}.$$

If $q^T \nabla\phi_j(z,x^k) > 0$ go to step 2c, otherwise go to step 2d.

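Step 2c: Add a column to $A$ and update $Q$ and $R$ accordingly. Writing $\bar{A} = [A : a]$ where $a = \nabla\phi_j(z,x^k)$ we have

$$\bar{A} = Q \begin{bmatrix} R & v_1 \\ 0 & v_2 \end{bmatrix},$$

where $v_1 = Q_1^T a$, $v_2 = Q_2^T a$.

Now we may take a product of Givens transformations, $G_\nu$, to transform $v_2$ into a vector having zero in all the components save the first, where $G_\nu$ is equal to the identity matrix except for the $2 \times 2$ block

$$\begin{bmatrix} \gamma & \sigma \\ \sigma & -\gamma \end{bmatrix},$$

where $\gamma$ and $-\gamma$ occupy positions $\nu$ and $\nu+1$ on the main diagonal. Then, writing $G$ for the product of these transformations,

$$\bar{A} = [Q G^T] \left[ G \begin{bmatrix} R & v_1 \\ 0 & v_2 \end{bmatrix} \right] = \bar{Q} \begin{bmatrix} R & v_1 \\ 0 & \tilde{v}_2 \end{bmatrix} = \bar{Q} \begin{bmatrix} \bar{R} \\ 0 \end{bmatrix} \quad (\tilde{v}_2 = [\|v_2\|_2, 0, \ldots, 0]^T).$$

Set $A = \bar{A}$, $Q = \bar{Q}$, $R = \bar{R}$, $J_0 = J_0 \cup \{j\}$, $LE = LE \setminus \{j\}$.

If the cardinality of $J_0$ reaches $n+1$, go to Step 9 below. Otherwise go to step 2b.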
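The column-adding update of step 2c can be sketched as follows, assuming NumPy. The function name `add_column` and the test matrix are hypothetical, and the rotations are applied to adjacent pairs from the bottom upwards, which is one possible ordering and not necessarily the one used by the authors. Each $2 \times 2$ block has the symmetric form $[\gamma\ \sigma; \sigma\ {-\gamma}]$ given in the text, so it is its own inverse.

```python
import numpy as np

def add_column(Q, R, a):
    """Update A = Q[:, :k] @ R to [A a] = Qbar[:, :k+1] @ Rbar using Givens transformations."""
    n, k = Q.shape[0], R.shape[0]
    Q = Q.copy()
    v = Q.T @ a                       # v[:k] is v1, v[k:] is v2
    # Zero v[n-1], ..., v[k+1] in turn; the first k columns of Q are never touched.
    for i in range(n - 1, k, -1):
        r = np.hypot(v[i - 1], v[i])
        if r == 0.0:
            continue
        c, s = v[i - 1] / r, v[i] / r
        G = np.array([[c, s], [s, -c]])          # symmetric and orthogonal
        v[i - 1:i + 1] = G @ v[i - 1:i + 1]      # becomes [r, 0]
        Q[:, i - 1:i + 1] = Q[:, i - 1:i + 1] @ G
    R_bar = np.zeros((k + 1, k + 1))
    R_bar[:k, :k] = R
    R_bar[:, k] = v[:k + 1]
    return Q, R_bar

# Build the factorization one column at a time, starting from Q = I and an empty R.
A = np.array([[1.0, 1.0],
              [-1.0, 0.0],
              [-0.5, -1.0],
              [2.0, -1.0]])
Q, R = np.eye(4), np.zeros((0, 0))
for col in A.T:
    Q, R = add_column(Q, R, col)
```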

Step 2d: Store $Q_1$ and $Q_2$ in $\hat{Q}_1$ and $\hat{Q}_2$ respectively. Store $R$ in $\hat{R}$.

[In the cases where $LE(x^k,\epsilon)$ is originally empty the storage of $Q_1$ and $Q_2$ is unnecessary].

Step 2e: If $NE(x^k,\epsilon) = \emptyset$ go to step 4. Otherwise set

$$j = \left\{ i \text{ that maximizes } \frac{q^T \nabla\phi_i(z,x^k)}{\|q\| \, \|\nabla\phi_i(z,x^k)\|} \;\Big|\; i \in NE(x^k,\epsilon) - K_0 \right\}.$$

If $q^T \nabla\phi_j(z,x^k) \le 0$ go to step 4. Otherwise, add $\nabla\phi_j(z,x^k)$ to $A$ and update $Q$ and $R$ as outlined in step 2c above.

Set $K_0 = K_0 \cup \{j\}$, $NE = NE \setminus \{j\}$.

If the cardinality of $J_0 \cup K_0$ reaches $n+1$ go to step 9, otherwise set $q = Q_2 Q_2^T e$ and return to the start of step 2e.

Step 3: Determine the projection matrix (this step involves inner iterations).

Step 3a:

[Recall that $\hat{Q}$ is that part of the decomposition that corresponds to just those linear functions $f_i(x)$ included in the projection during the previous iteration. If storage restrictions make it necessary $\hat{Q}$ may be discarded and recomputed].

Step 3b: Let $q = Q_2 Q_2^T e$. If $LE(x^k,\epsilon) - J_0 = \emptyset$ go to step 3d. Otherwise, choose $j \in LE(x^k,\epsilon) - J_0$, add $\nabla\phi_j(z,x^k)$ to $A$ and update $Q$ and $R$ as above [Step 2c].

Set $J_0 = J_0 \cup \{j\}$, $LE = LE \setminus \{j\}$, $q = Q_2 Q_2^T e$.

If $q \ne 0$ go to step 3b.

Step 3c: Drop columns from $A$ and update $Q$ and $R$ accordingly. Suppose

$$\bar{A} = [a_1, a_2, \ldots, a_{j-1}, a_{j+1}, \ldots, a_k],$$

i.e., $\bar{A}$ is $A$ with the $j$th column deleted. The resulting factorization is of the form

$$\bar{A} = Q \begin{bmatrix} H \\ 0 \end{bmatrix},$$

where $H$ is right Hessenberg and is obtained from $R$ by deleting its $j$th column. The subdiagonal elements in $H$, which extend from the new $j$th (previous $(j+1)$th) column of $R$ to the last column, can be removed by applying an appropriate sequence of Givens transformations. Thus, writing $G$ for the product of these transformations, we may write

$$\bar{A} = [Q G^T] \left[ G \begin{bmatrix} H \\ 0 \end{bmatrix} \right] = \bar{Q} \begin{bmatrix} \bar{R} \\ 0 \end{bmatrix}.$$

$q = 0$ implies that $e$ lies in the span of the columns of $A$, i.e.,

$$Aw = e.$$

Since $A = Q_1 R$, the vector $w$ is given by

$$w = R^{-1} Q_1^T e;$$

set $y = Q_1^T e$ and solve $Rw = y$.

If $w \ge 0$, stop, $x$ is optimal. Otherwise, choose $j$ such that $w_j < 0$ [typically one chooses $j$ such that $w_j$ is most negative] and drop the corresponding $a_j$ from $A$, updating $Q$ and $R$ as outlined above.

Set $J_0 = J_0 \setminus \{j\}$, $LE = LE + \{j\}$.

Step 3d: As for step 2d above.

Step 3e: As for step 2e above. [Except that the last statement is replaced by "return to the start of step 3e"].
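The test in step 3c (when $q = 0$) amounts to solving $Rw = Q_1^T e$ and inspecting the signs of $w$. Below is a minimal sketch assuming NumPy; the name `optimality_test`, the tolerance `tol` and the two one-dimensional examples are hypothetical. The first example corresponds to $\max(x, -x)$ at $x = 0$, which is optimal; the second to $\max(x, 2x)$ at $x = 0$, which is not.

```python
import numpy as np

def optimality_test(A, tol=1e-12):
    """Given q = 0 (e in the span of A), return w and the column index to drop, or None."""
    n = A.shape[0]
    e = np.zeros(n)
    e[0] = 1.0
    Q1, R = np.linalg.qr(A)               # thin factorization A = Q1 R
    w = np.linalg.solve(R, Q1.T @ e)      # solve R w = Q1^T e
    if np.all(w >= -tol):
        return w, None                    # w >= 0: the point is optimal
    return w, int(np.argmin(w))           # drop the most negative component

# Columns are [1, -f_i'(0)]^T for f_1 = x, f_2 = -x.
A_opt = np.array([[1.0, 1.0],
                  [-1.0, 1.0]])
w_opt, drop_opt = optimality_test(A_opt)

# Columns are [1, -f_i'(0)]^T for f_1 = x, f_2 = 2x.
A_not = np.array([[1.0, 1.0],
                  [-1.0, -2.0]])
w_not, drop_not = optimality_test(A_not)
```

In the second case the column belonging to $f_2 = 2x$ is dropped, after which the projected direction decreases $x$ and $f_2$ becomes inactive.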
Step 4: Set $d = -q$.

Let $A = \{i_1, \ldots, i_j\}$ denote $J_0 \cup K_0$.

Step 5: Check if optimum is reached.
If $\|d\| <$ epstop and Label $\ne 6$, or if $\epsilon <$ epstop, stop.


iterations]. Set $z_k = M_f(x^k)$. Determine $E(x^k,\epsilon)$. (Note that $x^k$ is the point obtained from the horizontal step.)

Step 8a: Set $Q_1 = \hat{Q}_1$, $Q_2 = \hat{Q}_2$, $R = \hat{R}$.


Step 6: Linear Search

The linear search is done directly on the minimax function. Let $d^-$ be the direction $d$ of step 4 with the first component deleted (remember that the first component of $d$, $d^+$, corresponds to $z$ and $d^-$ corresponds to $x \in \mathbb{R}^n$). Then $d^-$ has the property that it will try to decrease the subset of the active functions at the point $x^k$, $f_{i_1}(x), f_{i_2}(x), \ldots, f_{i_j}(x)$ (i.e., those we put into the projection), by the same amount. (Thus decreasing $M_f(x)$, since, by construction, the remaining active functions at the point $x^k$ will locally decrease along $d^-$, as this is the basis on which the projection matrix was determined.) Thus at the point $x^k$ we move downhill along the valley defined by the minimax functions $f_{i_1}(x), f_{i_2}(x), \ldots, f_{i_j}(x)$. Note that if we consider only one function in $P$, say the function $f_{i_1}(x)$, then $d^-$ is a positive multiple of $-\nabla f_{i_1}(x)$, which is the steepest descent direction for the function $f_{i_1}(x)$.

Determine $\tau > 0$ such that

$$\max_{i \in [M]} f_i(x^k + \tau d^-)$$

is minimized. Call this $\tau_{opt}$.

For details of how this is done and whether exact minimization is required, see the sub-algorithm "The Linear Search Algorithm" below. Put

$$x^k \leftarrow x^k + \tau_{opt} d^-.$$

Step  7 :     Decision  as  to  whether  to  do  the  vertical 
step  or  not.     The  vertical  direction  amounts  to 
attempting  to  make  the  near  active  functions 
exactly  equal,  and  by  doing  so  an  effort  is  made  to 
get  exactly  on  the  line  of  the  discontinuous 
derivatives,  which  is  very  desirable  when  we  are 
close  to  the  solution.    We  expect  that  we  will  be 
close  to  the  solution  either  when  the  active 
functions  remain  the  same  at  each  iteration  or  if  we 
are  trying  to  consider  more  than    n    functions  in 
the  projection.     [Actually,  even  if  we  are  not  near 
the  solution,  should  either  of  these  two  phenomena 
occur  we  might  need  to  "reach  the  valley  floor"  in 
order  to  move  away  from  the  current  situation] . 
Algorithmically, if the number of active functions has not changed in three consecutive iterations, $K_0 \ne \emptyset$, and $\|d\| < .1$, or if Label = 6 (see step 9), go to step 8. Otherwise, set $x^{k+1} = x^k$, $k = k+1$ and go to step 1.


Step  8b:     As  for  step  2e  above  [except  that  the 
last  statement  is  replaced  by  "return  to  the 
start  of  step  8b"  and  "go  to  step  4"  is  replaced  by 
"go  to  step  8c"] . 


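Step 8c: $v(x^k,\epsilon) = -Q_1 R^{-T} \Phi$, where $\Phi = (\phi_{i_1}, \ldots, \phi_{i_j})^T$.

Put $x_{temp} = x^k + v^-$ (i.e., $x^k + v$ with the 1st component of $v$ missing).

If

$$\max_{i \in [M]} f_i(x_{temp}) < \max_{i \in [M]} f_i(x^k),$$

set vs = 1, $x^{k+1} = x_{temp}$, $k \leftarrow k+1$, Label = 0 and go to step 1. Otherwise set vs = 0, $x^{k+1} = x^k$, $k \leftarrow k+1$ and go to step 1.

(Note that $f_{i_\ell}(x^k) + \nabla f_{i_\ell}(x^k)^T v^- = M_f(x^k) + v^+ + \epsilon_\ell$, where $|\epsilon_\ell| \le \epsilon$, $\ell = 1, 2, \ldots, j$, and $v^+$ is the first component of $v(x^k,\epsilon)$. Therefore the linearized active minimax functions at $x^k$ will be within $\epsilon$ at the point $x_{temp} = x^k + v^-$).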

Step 9: Trying to put $(n+1)$ constraints in the projection, i.e., $|J_0 \cup K_0| = n+1$.

$|J_0 \cup K_0| = n+1$ means one of two possibilities: either a) we have reached the neighbourhood of the optimum, or b) we have constraints considered active that actually are not. This situation is handled in two ways. Firstly, by ensuring that we take a vertical step and secondly, by reducing $\epsilon$. Algorithmically, set Label = 6, put $\epsilon = \epsilon/10$ and go to step 8.

The  Linear  Search  Algorithm 

Step  0:  If  vertical  step  was  successful  (i.e.,  if 
vs  =  1),  go  to  step  4. 

Step 1: Estimate any new function to become active.

We consider all inactive constraints $\phi_j$ and estimate, in turn, the stepsize to make each $\phi_j$ zero. Hence we calculate

$$\tau_j = -\frac{\phi_j(z_k,x^k)}{\nabla\phi_j(z,x^k)^T d}, \quad j \in I(x^k,\epsilon). \tag{24}$$


Step  8:     The  Vertical  Step  [involves  inner 
In other words, if $\phi_j$ was linear we calculate $\tau_j$ so that $\phi_j((z_k,x^k) + \tau_j d) = 0$ and in general we linearize the $\phi$'s. Let $f_m(x^k) = M_f(x^k)$ for some $m$, $f_j(x)$ an inactive function at the point $x^k$ and $\tau_j$ as calculated by (24). Then we want the linear approximations of the functions $f_m$ and $f_j$ about the point $x^k$ to intersect at the point $x^k + \tau_j d^-$. By construction $q$ is orthogonal to $\nabla(z - f_m(x))$ at the point $x^k$, i.e.,

$$\nabla f_m(x^k)^T d^- = d^+. \tag{25}$$

Also, by construction of $\tau_j$,

$$\phi_j(z_k,x^k) + \tau_j [1, -\nabla f_j(x^k)^T] d = 0.$$

By using the fact that $z_k = M_f(x^k)$ we have

$$f_j(x^k) + \tau_j \nabla f_j(x^k)^T d^- = z_k + \tau_j d^+ = M_f(x^k) + \tau_j d^+ = f_m(x^k) + \tau_j d^+. \tag{26}$$

From (25)

$$f_m(x^k) + \tau_j \nabla f_m(x^k)^T d^- = f_m(x^k) + \tau_j d^+. \tag{27}$$

Comparing (26) and (27) we can see that we have the desired result.
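The stepsize estimate (24) can be sketched in plain Python as below. The helper names `dot` and `breakpoint_steps`, and the one-dimensional linear example ($f_0 = -x$ active and $f_1 = x - 1$ inactive at $x^k = 0$, with $d = [-0.5, 0.5]$ obtained from the projection on $\nabla\phi_0$), are assumptions made for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def breakpoint_steps(z, f_vals, grads, d, inactive):
    """tau_j = -phi_j / (grad phi_j^T d) for each inactive index j, as in (24)."""
    d_plus, d_minus = d[0], d[1:]
    taus = {}
    for j in inactive:
        phi_j = z - f_vals[j]                      # phi_j(z_k, x^k)
        slope = d_plus - dot(grads[j], d_minus)    # [1, -grad f_j^T] d
        if slope != 0.0:                           # skip directions along which phi_j is constant
            taus[j] = -phi_j / slope
    return taus

# f_0(x) = -x and f_1(x) = x - 1 evaluated at x^k = 0.
x_k = 0.0
f_vals = [0.0, -1.0]
grads = [[-1.0], [1.0]]
d = [-0.5, 0.5]
taus = breakpoint_steps(max(f_vals), f_vals, grads, d, inactive=[1])

# The two linear functions meet at x^k + tau_1 * d^-.
x_new = x_k + taus[1] * d[1]
```

Since both functions here are linear, the estimated breakpoint is exact: at `x_new` the two functions take the same value.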

Step 2: Omitting unlikely values of $\tau_j$, estimate the optimum, $\tau_{opt}$, by linearizing the minimax function about $x^k$. For $j \in I(x^k,\epsilon)$ do the following. If $\tau_j < 0$ or $\tau_j > \tau_{max}$ neglect it as inadmissible. Otherwise calculate

$$t_{ij} = f_i(x^k) + \tau_j \nabla f_i(x^k)^T d^-, \quad i \in [M],$$

where

$$\nabla f_i(x) = \left[\frac{\partial f_i}{\partial x_1}(x), \ldots, \frac{\partial f_i}{\partial x_n}(x)\right]^T.$$

Put

$$F_j = \max_{i \in [M]} t_{ij}.$$

Now, determine $\ell$ such that

$$F_\ell = \min_j F_j, \quad j \in I(x^k,\epsilon) \setminus \{j \mid \tau_j < 0 \text{ or } \tau_j > \tau_{max}\}.$$

Put $\tau_{opt} = \tau_\ell$.

Step 3: Determine if $\tau_{opt}$ is acceptable.

Calculate the true minimax value at

$$\hat{x}^k = x^k + \tau_{opt} d^-.$$

If this new value is an improvement over the old value, set $x^k = \hat{x}^k$. Otherwise go to step 4.

Step 4: Use cubic line search on the maximum of the functions, taking $\tau_{opt}$ as an upper bound, to obtain the new $x^k$, if $\tau_{opt}$ is available from step 3 [i.e., if vs = 0].
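Step 2 of the linear search can be sketched as follows in plain Python. The function name `estimate_tau_opt` and the three linear one-dimensional functions ($f_0 = -x$, $f_1 = x - 1$, $f_2 = 3x - 1.2$ at $x^k = 0$, with $d^- = 0.5$) are hypothetical; the candidate stepsizes `taus` are assumed to have been obtained from (24).

```python
def estimate_tau_opt(f_vals, dir_derivs, taus, tau_max):
    """Pick the admissible tau_j minimizing the linearized minimax value F_j."""
    best_tau, best_F = None, float("inf")
    for j, tau in taus.items():
        if tau < 0 or tau > tau_max:
            continue                                   # inadmissible stepsize
        # t_ij = f_i(x^k) + tau_j * grad f_i(x^k)^T d^-
        t = [f + tau * g for f, g in zip(f_vals, dir_derivs)]
        F = max(t)                                     # F_j = max_i t_ij
        if F < best_F:
            best_tau, best_F = tau, F
    return best_tau, best_F

f_vals = [0.0, -1.0, -1.2]          # f_i(x^k) at x^k = 0
dir_derivs = [-0.5, 0.5, 1.5]       # grad f_i(x^k)^T d^- with d^- = 0.5
taus = {1: 1.0, 2: 0.6}             # stepsizes from (24) for the inactive functions
tau_opt, F_opt = estimate_tau_opt(f_vals, dir_derivs, taus, tau_max=10.0)
```

In step 3 the true value of the maximum at $x^k + \tau_{opt} d^-$ would then be compared with the old value before the point is accepted.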

2.3  Some  remarks  on  the  algorithm 

i)  As  is  noted  in  [4]   the  QR  decomposition  and 
its  update  when  a  column  is  added  or  deleted  from 
A,  arises  with  such  frequency  that  a  wealth  of 
techniques  have  been  developed  to  handle  the 
situation.     The  method  set  out  here  corresponds  to 
that  in  [11]  .     It  is  used  presently  solely  because 
it  is  the  simplest  to  explain.     For  alternative 
methods,  more  details,  and  references,  see  [4]. 

ii) In practice we do not use an exact cubic linear search but merely ask for sufficient improvement in the minimax value.

iii) If in step 7 of the main algorithm we decided to do the vertical step and it was successful, we dispense with the estimation of $\tau_{opt}$ as above and merely do the cubic search. The motivation for this is as follows. The estimates for $\tau_j$ are based on the surmise that some new function will become active whereas the vertical step is based on the assumption that this will not be the case.

iv) In the above algorithm we have assumed that the columns given by $\nabla\phi_i$, $i \in LE(x,\epsilon)$, are
linearly  independent.     The  situation  in  which  the 
columns  are  dependent  corresponds  precisely  to 
degeneracy  in  linear  programming.     As  was  done  in 
[4],  we  will  regard  degeneracy  as  a  condition 
which  can  be  cleared  up  through  the  use  of 
negligible  perturbations  of  the  data,  and  we  shall 
ignore  the  linearly-dependent  case. 

v)  Under  some  fairly  natural  assumptions  the 
above  algorithm  can  be  proved  to  be  convergent. 
The  proofs  are  somewhat  lengthy  and  are  given 
elsewhere   [5] .     In  the  case  of  solely  linear 
functions,  finite  convergence  can  be  proved. 
[See  [4]]. 

2.4  Relationship  of  the  above  with  other 
algorithms 

i)  The  algorithm  differs  from  that  given  in  [5] 

in  that  it  is  a  stable  implementation.  Furthermore, 
whenever  possible,  advantage  is  taken  of  any  linear 
functions  that  might  be  active.     This  is  done  by 
preferentially  entering  the  linear  functions  into 
the  projection  and  storing  that  part  of  the 
decomposition  (QR)  that  corresponds  to  the  linear 
functions  alone.     Thus,  instead  of  recomputing  the 
entire  decomposition  at  each  iteration,  the  linear 
part  can  be  updated. 

ii) As was remarked above, in the purely linear case, the algorithm 2.2 does not, quite, reduce to the algorithm of [4]. This is because in that algorithm $M_f(x) = \max_i |f_i(x)|$ and, in addition, the linear search differs from that above. Thus the corresponding P* is

$$\text{Min } z \quad \text{subject to} \quad z - |f_i(x)| \ge 0, \quad i \in [M]. \tag{28}$$

It is apparent that the above problem is equivalent to

$$\text{Min } z \quad \text{subject to} \quad z - f_i(x) \ge 0, \quad z + f_i(x) \ge 0, \quad i \in [M], \qquad \text{(P*')}$$

with corresponding

$$\min_x \max_{i \in [M]} \{f_i(x), -f_i(x)\}. \qquad \text{(P')}$$


If  one  formulates  the  problem  this  way,  eliminates 
the  vertical  step  on  account  of  it  being  super- 
fluous in  the  purely  linear  case,  then,  excepting 
the  linear  search,  one  has,  essentially,  the 
algorithm  of  [3].     For  completeness  I  point  out 
that  they  use  fast  Givens  without  square  roots  in 
their  decomposition  of    A     instead  of  the  QR 
given  above.     (The  elements  of  fast  Givens  are 
given  in  [13]  )  . 

There  are  three  linear  searches  mentioned  in 
[3].     Besides  the  complexity  introduced  by  the 
absolute  modulus  function,  none  of  the  methods 
correspond  exactly  to  that  given  in  algorithm  2.2, 
(henceforth  referred  to  as  LSA(2.2)).  For 
simplicity  we  will  describe  equivalent  forms  for 
the  original  problem  P. 

Linear  Search  Algorithm  A.  (LSA(A)). 

As for LSA(2.2) but determine $\ell$ such that $\tau_\ell$ is the smallest positive $\tau_j$, $j \in I(x^k,\epsilon)$ [i.e., locate the first break-point reached in the direction $d^-$ of the piecewise linear function $M_f$ on leaving $x^k$].

Linear Search Algorithm B. (LSA(B)).

Step 1: As for step 1 of LSA(2.2).

Step 2: Put $\tau_\ell$ as the smallest positive $\tau_j$, $j \in I(x^k,\epsilon)$.

Step 3: [Determine whether $M_f$ is still decreasing in the direction $d^-$].

If $[d^-]^T \nabla f_\ell(x^k) \ge 0$, terminate the linear search algorithm with $\tau_{opt} = \tau_\ell$.

Otherwise it is necessary to make the following redefinitions

$$x^k \leftarrow x^k + \tau_\ell d^-, \qquad z_k \leftarrow M_f(x^k + \tau_\ell d^-)$$

and return to step 1.

Linear  Search  Algorithm  C.  LSA(C) 

This  algorithm  was  a  mixture  of  LSA(A)  and  (B) . 
Specifically  it  used  LSA(A)  until  such  time  as 
q  =  0  in  step  3b  of  algorithm  2.2,  above.     Then  in 
step  6  it  used  LSA(B) .     In  subsequent  iterations 
one  returned  to  LSA(A)  for  step  6  until     q  =  0 
again  in  step  3b,  and  so  on. 

Schematically we have the following situation for the linear search.

Figure 3. $\alpha$ corresponds to $\tau_{opt}$ from LSA(A), $\beta$ to $\tau_{opt}$ from LSA(2.2), and $\gamma$ to $\tau_{opt}$ from LSA(B).

Using LSA(A), with the exception of, possibly, the first $n+1$ iterations, each of the points $x^k$ was a vertex for the (linear programming) problem P*' (or at least satisfied a maximal number of active constraints in case the problem was rank deficient). It is for this reason that they preferred strategy LSA(A). However, in the case where the $f_i(x)$'s are non-linear, essentially, as is usual in non-linear programming, one is looking only for a sufficient decrease in the objective function, $M_f(x)$. Strategy LSA(B) is undesirable in that it requires too much additional computation.

iii) The algorithm of [10] corresponds exactly to
the algorithm above using LSA(A) (except that
Cline is restricted to Haar-condition problems)
starting at a vertex of the (linear programming)
problem P'. In addition to being able to handle
non-Haar  problems,  the  treatment  given  above  is 
more  flexible.     In  particular  the  addition  of 
constraints  can  be  handled  in  a  natural  way  that  is 
particularly  simple  in  the  case  of  linearly 
constrained  problems. 

iv) Perhaps the best known algorithm for handling problems of type (28) in the linear case is that of Barrodale and Phillips [2]. They, in fact, add an additional linear constraint, viz.

$$z \ge 0$$

to P*' and then solve the dual linear program. The structure of the problem enables them to condense the resulting tableau by suppressing all but $m$ of the columns. There is some suggestion that in practice the dual form, at least for linear problems, is the best approach to the $\ell_\infty$ problem. However, to the best of the present author's knowledge, there is, as yet, no well-documented evidence to support this assertion. In particular, let us formulate the algorithm of Barrodale and Phillips in terms of the primal. The algorithm consists of three stages. More specifically, suppose the problem under consideration is

$$\min_x \|b - Ax\|_\infty = \min_x \max_{1 \le i \le m} \Big| b_i - \sum_{j=1}^{n} a_{ij} x_j \Big|$$

and the rank of $A$ is $k$.

In stage 1 one chooses the residual of maximum absolute value and, essentially, uses the steepest descent direction to bring its value to zero. One then updates the residual values, choosing that which is now maximal in absolute value and bringing it down to zero, whilst still holding at zero those residuals that are already at zero. One then iterates until $k$ residuals are at zero.

This is readily accomplished by means analogous to those above for algorithm 2.2. For example, suppose

$$b_i - \sum_{j=1}^{n} a_{ij} x_j = 0, \quad i = 1, 2, \ldots, \ell,$$

and

$$\Big| b_r - \sum_{j=1}^{n} a_{rj} x_j \Big| \text{ is maximal.}$$

Define $P$ as in (3) with $N = [a_1 \ldots a_\ell]$, where $a_i = [a_{i1} \ldots a_{in}]^T$.

The required descent direction is given by $d = Pg$ where $g = a_r \, \text{sign}(e_r)$ ($\text{sign}(e_r) = 1$ if $e_r > 0$, $-1$ if $e_r < 0$, and $0$ otherwise).

Since  in  stage  1,   typically  one  increases  the 
number  of  residuals  with  value  zero  by  1,  the 
orthogonal  decompositions,  plus  corresponding  up- 
dates, are  applicable  here.     Stage  1  is 
terminated  when    k    residuals  have  the  value  zero. 

Stage 2 consists of increasing all the $k$
residuals (in absolute value) simultaneously, until
$k+1$ residuals have the same absolute value.
Stage 2 involves only one iteration.

Stage  3  is  equivalent  to  the  well-known 
exchange  algorithm  for  linear  minimax  approximation 
(see  for  example  [14]). 

Again  Stages  2  and  3  can  be  accomplished  with 
linear  algebra  analogous  to  that  of  algorithm  2.2. 

The  above  is  only  a  very  brief  outline  of  the 
algorithm  of  Barrodale  and  Phillips.  The  details 
of  the  condensed  simplex  implementation  are  given 
in  [1]. 

In the light of the above comments one is able to see that the "dual" formulation of Barrodale and Phillips can be realized in a stable way by a direct formulation as for algorithm 2.2. It then seems natural to ask: if the dual approach to the minimax problem is superior to the direct approach, when is a dual approach indirect? Furthermore, Barrodale and Phillips claim that the main feature of their algorithm is that it enters stage 3 with a value close to optimal. However, if one compares how one arrives at stage 3, in terms of the primal problem, to the analogous situation in algorithm 2.2 [i.e., when one reaches a vertex (or, more generally, a point where $q = 0$)], the apparent superiority of the "indirect" or "dual" methods appears surprising.

v)  Many of the ideas of [17] are in fact equivalent to those of
the horizontal direction of algorithm 2.2. However, the vertical
direction is dispensed with at the cost of having to reduce to
zero the tolerance to which functions are considered active.
This makes ultimate convergence difficult (cf. [7] and [9] for a
similar result). Furthermore, the numerical aspects of the
Zangwill algorithm are not considered by him. In particular, no
linear search suitable for minimizing
M^(x)     is  given.     Finally,   Zangwill  introduced 

separate cases dependent on whether a certain matrix is of full,
or almost full, rank. Such a differentiation of cases is
numerically undesirable and does not occur in algorithm 2.2.

2.5     Some  additional  remarks  and  conclusions. 

As  is  noted  above,  and  in  [4]   for  the  linear 
case,   the  addition  of  constraints  to  problem  P 
can  be  handled  in  a  natural  way.     This  is  an 
immediate  consequence  of  the  fact  that  algorithm 
2.2  is  based  on  projections. 

One  might  also  remark  that  since  the 
projections  in  algorithm  2.2  are  orthogonal,  at 
least  in  terms  of  the  non-linear  programming 
problem    P* ,  the  iterations  consist     of  steepest- 
descent-like  iterations  in  certain  subspaces.  It 
might  be  suggested  that  one  might  be  able  to 
determine  Newton  or  Quasi-Newton  type  iterations  by 
using  non-orthogonal  type  projections  (see  for 
example  [15]  and   [12]  for  parallels  in  non-linear 
programming).     However,  since  the  objective 
function  (z)  in    P*     is  linear,  the  straightforward 
non-orthogonal  projection  will  not  do  the  job. 
However, consider the case of just one function, $f_j(x)$ say,
active. Algorithm 2.2 gives $-\nabla f_j(x)$ for d, i.e.,
steepest descent. However, for general non-linear $f_j$ the
corresponding Newton direction $-G_{f_j}^{-1}(x)\,\nabla f_j(x)$
is well defined [where $G_{f_j}$ denotes the Hessian of $f_j$,
here assumed invertible]. In other words, a second-order type
generalization is not obvious, even assuming it is desirable.

The  above  is  meant  to  be  a  summary  of  one 
(direct)  approach  to  minimax  problems  via 
projections.     Some  attempt  has  been  made  to  relate 
the  approach  to  other  algorithms  designed  for  the 
same  problem.     It  is  also  hoped  that  the  results 
reported  here  indicate  that  there  is  still  work  to 
be  done  in  the  area  with  suggestions  as  to  some 
avenues  of  future  research. 

Numerical results are not reported here. Suffice it to say that
numerical experiments have been carried out in the context of
applications and of general non-linear and linear problems. They
are reported in detail in references [5], [3] and [6].

Abstract

This paper discusses two algorithms for non-linearly constrained
optimization. These algorithms -- the penalty and barrier
trajectory algorithms -- are based on an examination of the
trajectories of approach to the solution that characterize the
quadratic penalty function and the logarithmic barrier function,
respectively. Although closely related in principle, the two
algorithms display important differences in their implementation
as well as in the properties of the generated iterates. The
discussion will emphasize the numerical aspects of implementation
of the trajectory algorithms, with particular attention to the
choice of reliable methods for carrying out the required
computations.

1.  Introduction

The problem to be considered in this paper is the following:

    P1:  minimize    $F(x)$,  $x \in E^n$
         subject to  $c_i(x) \ge 0$,  $i = 1, 2, \ldots, m$,

where F(x) and $\{c_i(x)\}$ are prescribed non-linear functions.
The function F(x) is usually termed the "objective function" and
the set $\{c_i(x)\}$ is the set of "constraint functions". It
will be assumed for simplicity that F and $\{c_i\}$ are twice
continuously differentiable, although the methods to be
discussed will cope with occasional discontinuities.

The solution of P1 will be denoted by x*. In all problems to be
considered, the first- and second-order Kuhn-Tucker conditions
are assumed to be satisfied at x*, so that there exists a vector
$\lambda^*$ of non-negative Lagrange multipliers, corresponding
to the active constraints at x*, satisfying

    $$ g(x^*) - A(x^*)\,\lambda^* = 0 \tag{1} $$

where $g(\cdot)$ is the gradient of F, and the columns of
$A(\cdot)$ are the gradients of the constraints active at x*.

It is customary to state problem P1 with two kinds of
constraints -- equality constraints (of the form $c_j(x) = 0$),
as well as the inequalities given above in P1. This distinction
will not be made during the present discussion, in order to
avoid the introduction of additional notation. The treatment of
inequality constraints is typically more complicated than that
of equality constraints, and the algorithms to be discussed can
deal in an obvious way with equality constraints.

Because the problem P1 cannot, in general, be solved explicitly,
a popular approach during the past decade has been to transform
P1 into a sequence of unconstrained minimization problems. The
most common such transformation has been effected by the use of
penalty and barrier function methods, which are discussed at
length in, for example, Fiacco and McCormick (1968) and Ryan
(1974). This paper will not review these methods, but their
properties are critical in the derivation of the algorithms to
be discussed.

The quadratic penalty function corresponding to the problem P1
is defined by

    $$ P(x,\rho) \equiv F(x) + \frac{\rho}{2} \sum_{i \in I} (c_i(x))^2 , \tag{2a} $$

where I is a subset of the indices {1,2,...,m} (usually, the set
of constraints whose values are less than a small positive
number), and $\rho$ is a positive scalar termed the "penalty
parameter". The quadratic penalty function may also be written
as:

    $$ P(x,r) = F(x) + \frac{1}{2}\, r^{-1} \sum_{i \in I} (c_i(x))^2 , \tag{2b} $$

where $r = 1/\rho$. Let $x_P^*(r)$ denote an unconstrained
minimum of P(x,r).
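As a minimal sketch of definition (2a), the function below evaluates the quadratic penalty function with the index set I chosen as the constraints whose values fall below a small positive threshold. The names `quadratic_penalty` and `delta`, and the sample problem, are illustrative assumptions rather than part of the paper.

```python
import numpy as np

def quadratic_penalty(F, c, x, rho, delta=1e-8):
    """Return (P(x, rho), I), where I indexes constraints with c_i(x) < delta."""
    cx = np.asarray(c(x), dtype=float)
    I = np.flatnonzero(cx < delta)          # index set I of (2a)
    return F(x) + 0.5 * rho * np.sum(cx[I] ** 2), I

# Hypothetical test problem: F(x) = x1^2 + x2^2 with two constraints c_i(x) >= 0.
F_demo = lambda x: float(x @ x)
c_demo = lambda x: np.array([x[0] + x[1] - 1.0, x[0]])

P_infeasible, I_infeasible = quadratic_penalty(F_demo, c_demo, np.array([0.0, 0.0]), rho=10.0)
P_feasible, I_feasible = quadratic_penalty(F_demo, c_demo, np.array([2.0, 2.0]), rho=10.0)
```

At a strictly feasible point the index set is empty and the penalty function reduces to F(x).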

The logarithmic barrier function corresponding to the problem P1
is given by:

    $$ B(x,r) = F(x) - r \sum_{i=1}^{m} \ln(c_i(x)) , \tag{3} $$

where r is a positive scalar termed the "barrier parameter". The
logarithmic barrier function is defined only at points that
strictly satisfy all of the problem constraints. Let $x_B^*(r)$
denote an unconstrained minimum of B(x,r).

Under certain mild conditions, there exists $\bar{r} > 0$ such
that for $r < \bar{r}$, $x_P^*(r)$ and $x_B^*(r)$ are continuous
functions of r, and:

    $$ \lim_{r \to 0} x_P^*(r) = x^* ; \qquad \lim_{r \to 0} x_B^*(r) = x^* . $$
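A one-variable illustration (constructed here, not taken from the paper) makes the two limits concrete. Take $F(x) = x$ with the single constraint $c(x) = x - 1 \ge 0$, so that $x^* = 1$. Setting the derivatives of (2b) and (3) to zero gives both trajectories in closed form:

```latex
\begin{aligned}
P(x,r) &= x + \tfrac{1}{2}\,r^{-1}(x-1)^2, &
\frac{dP}{dx} &= 1 + \frac{x-1}{r} = 0
  &&\Longrightarrow\; x_P^*(r) = 1 - r, \\
B(x,r) &= x - r\ln(x-1), &
\frac{dB}{dx} &= 1 - \frac{r}{x-1} = 0
  &&\Longrightarrow\; x_B^*(r) = 1 + r .
\end{aligned}
```

Both trajectories are continuous in r and tend to $x^* = 1$ as $r \to 0$; the penalty trajectory approaches through points that violate the constraint, the barrier trajectory through strictly feasible points.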

Although penalty and barrier function methods display several
good features, they suffer from certain theoretical and
numerical defects -- in particular, both require the solution of
a theoretically infinite sequence of unconstrained problems.
Furthermore, in practice each successive unconstrained
sub-problem is more difficult to solve, because the Hessian
matrices of the penalty and barrier functions become
increasingly ill-conditioned as r approaches zero, and are
singular in the limit (see Murray, 1969a).
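The growth in ill-conditioning can be observed on a small example. The sketch below assumes a hypothetical problem with $F(x) = x_1^2 + x_2^2$ and one linear constraint $c(x) = x_1 + x_2 - 1$ in the index set I; for a linear constraint the Hessian of (2b) is $\nabla^2 F + r^{-1} a a^T$, with a the constraint gradient.

```python
import numpy as np

def penalty_hessian(r):
    """Hessian of P(x, r) for F = x1^2 + x2^2 and c = x1 + x2 - 1 (c in I)."""
    a = np.array([1.0, 1.0])                  # gradient of the linear constraint
    return 2.0 * np.eye(2) + np.outer(a, a) / r

def penalty_condition(r):
    return np.linalg.cond(penalty_hessian(r))

conditions = {r: penalty_condition(r) for r in (1.0, 1e-2, 1e-4, 1e-6)}
```

For this example the eigenvalues are 2 and 2 + 2/r, so the condition number is 1 + 1/r and grows without bound as r approaches zero.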

The continuous lines of minima in $E^n$ defined by $x_P^*(r)$
and $x_B^*(r)$ are termed the "penalty trajectory" and "barrier
trajectory" of approach to x*, respectively. The analysis of
these trajectories, and a detailed description of the trajectory
algorithms, have been given elsewhere (Murray, 1969a,b; Wright,
1976; Murray and Wright, 1976b); for the purposes of the present
discussion, only a brief description of the underlying
motivation will be stated.

The trajectory algorithms are based on using the properties of
the trajectories to generate a sequence of iterates that lie in
a neighborhood of the appropriate trajectory, in order to mimic
the approach to x* of the iterates from a penalty or barrier
function method. Because it is possible to characterize a step
toward the penalty or barrier trajectory without assuming that
the current iterate is in a close neighborhood of x*, the
derivation of the trajectory methods does not display a
stringent dependence on properties that hold only in such a
neighborhood. Moreover, it is not necessary for any of the
iterates to lie exactly on the trajectory (as in a penalty or
barrier function method).

At each iteration of a trajectory method, the search direction
is computed as a step toward some point on the desired
trajectory. The particular point to be aimed for depends on the
current value of the penalty or barrier parameter. The solution
x* is also on the trajectories, and the target point will
ultimately become arbitrarily close to x* as the algorithms
converge. The penalty or barrier parameter may be adjusted at
each iteration of the trajectory methods; however, the choice of
the parameter value is not critical, since a step to a
neighborhood of a point on the trajectory corresponding to r is
also in the neighborhood of a point corresponding to
$(1+\epsilon)r$, where $\epsilon$ is small. The numerical
procedure for determining the search direction in both
algorithms is well-posed, and the approach to the limit of the
penalty or barrier parameter does not cause any
ill-conditioning.

This paper will emphasize some numerical aspects of
implementation of the trajectory methods, with particular
attention to the choice of reliable procedures for carrying out
the required computations. The emphasis on the details of
implementation is deliberate; even within an algorithm that has
been designed from the outset to be robust, additional
safeguards are necessary to protect against failure or illogical
results when the underlying assumptions are not satisfied.

2.  Description of Trajectory Algorithms

Only certain key aspects of implementation of the trajectory
methods have been selected for discussion in Section 3.
Accordingly, the descriptions given here of the penalty and
barrier trajectory algorithms are slightly abbreviated, and do
not contain all the computational details. A complete
description of both algorithms is given in Murray and Wright
(1976b).


2.1.  Penalty Trajectory Algorithm

2.1.1.  Properties of the search direction

For the penalty trajectory algorithm, at each iteration the
search direction, p, is (ideally) constructed as the solution of
the following quadratic program:

    QP1:  min         $\tfrac{1}{2}\, p^T S p + p^T g$
          subject to  $A^T p = -c - \tfrac{1}{\rho}\,\lambda$,

where c denotes the vector of constraints currently considered
"active"; A is a matrix whose columns are the gradients of the
active constraints; $\lambda$ is an estimate of the Lagrange
multiplier vector; $\rho$ is the current value of the penalty
parameter; g is the gradient of F; and S is a matrix that
approximates the Hessian of the Lagrangian function.

Let Y be a matrix whose columns form an orthogonal basis for the
range of the columns of A, and let Z be a matrix whose columns
form an orthogonal basis for the corresponding null space, i.e.,

    $$ A^T Z = 0, \qquad Z^T Z = I . $$

If A has full column rank ($\le n$), and if the matrix $Z^T S Z$
is positive definite, then the solution of QP1, p*, can be
uniquely expressed as the sum of two orthogonal components:

    $$ p^* = Y p_R + Z p_N . $$

For sufficiently large $\rho$, the search direction p* so
constructed will always be a descent direction for the quadratic
penalty function; the step to be taken along the search
direction is then chosen to achieve an acceptable decrease in
the penalty function.

2.1.2.  Calculation of the search direction

At the beginning of the k-th iteration of the penalty trajectory
algorithm, the following vectors and matrices are assumed to be
available:

    $x^{(k)}$   an approximation to x*;
    $c^{(k)}$   the vector of values of $\{c_i(x)\}$ evaluated at $x^{(k)}$;
    $g^{(k)}$   the gradient vector of F(x) evaluated at $x^{(k)}$;
    $A^{(k)}$   the matrix whose columns are the gradients of
                $\{c_i(x)\}$ evaluated at $x^{(k)}$;
    $S^{(k)}$   an approximation to the Hessian matrix of the
                Lagrangian function at $x^{(k)}$.

The procedures followed during the k-th iteration to compute the
next iterate are:

(1)  An "active set" of constraints is determined, containing
$\le n$ elements (see Section 3.1 for a discussion of the case
where the active set contains more than n elements). The vector
of active constraint values will be denoted by c, and the matrix
whose columns are the gradients of those constraints will be
denoted by A.

(2)  Factorize A such that

    $$ QA = \begin{bmatrix} R \\ 0 \end{bmatrix}, \qquad Q^T Q = I , $$

where R is an upper triangular matrix. Define the matrices Y and
Z by partitioning Q as:

    $$ Q = \begin{bmatrix} Y^T \\ Z^T \end{bmatrix} . $$

(3)  Determine an estimate, $\lambda$, of the Lagrange
multiplier vector. If A has full column rank, $\lambda$ is the
least-squares solution of $\min \|A\lambda - g^{(k)}\|_2^2$, and
is given by the solution of the triangular system:

    $$ R\lambda = Y^T g^{(k)} . $$

If A is rank-deficient, the vector $\lambda$ will be taken as
the minimum-length least-squares solution; it is obtained by
extending the factorization of Step (2) to:

    $$ QAV = \begin{bmatrix} R & 0 \\ 0 & 0 \end{bmatrix} , \tag{4} $$

where R is a non-singular upper triangular matrix and V is an
orthogonal matrix (see Peters and Wilkinson, 1970, for further
details).

(4)  Determine an appropriate value of the penalty parameter,
$\rho$ (Murray and Wright, 1976b).

(5)  Compute the vector $p_R$, as follows. If A has full rank,
$p_R$ is obtained by solving the linear system:

    $$ A^T p = A^T Y p_R = R^T p_R = -c - \tfrac{1}{\rho}\,\lambda . \tag{5} $$

In this way, the direction $Y p_R$ satisfies the linear equality
constraints of QP1. If A is rank-deficient, $p_R$ is a
least-squares solution of (5), computed using the complete
orthogonal factorization (4); the linear constraints of QP1 will
then not be exactly satisfied.

(6)  Determine the modified Cholesky factorization of the matrix
$Z^T S^{(k)} Z$. With this procedure, the matrix $Z^T S^{(k)} Z$
is augmented (if necessary) by a positive diagonal matrix, E,
chosen to make $(Z^T S^{(k)} Z + E)$ strictly (numerically)
positive definite. Let $LDL^T$ be the computed factorization, so
that:

    $$ LDL^T = Z^T S^{(k)} Z + E , $$

where E is identically zero if $Z^T S^{(k)} Z$ is sufficiently
positive definite (Gill and Murray, 1972a).

(7)  Determine the vector $\bar{p}_N$ by solving

    $$ LDL^T \bar{p}_N = -Z^T g^{(k)} . $$

Test whether:

    $$ \|\bar{p}_N\| \le M \|p_R\| \quad \text{and} \quad \|p_R\| \le M \|\bar{p}_N\| , \tag{6} $$

for M a reasonably large positive number (say, 1,000).

(a)  If the test (6) is satisfied (as it almost always is in
practice), compute $p_N$ by solving:

    $$ LDL^T p_N = -Z^T \bigl( g^{(k)} + S^{(k)} Y p_R \bigr) , $$

then form the search direction as:

    $$ p = Y p_R + Z p_N . $$

(b)  If the test (6) is not satisfied, the two orthogonal
portions of the search direction are not well-scaled, and the
following re-scaling procedure is used to adjust for the
imbalance. If

    $$ \|\bar{p}_N\| > M \|p_R\| , $$

define a scaling factor $\beta_1$ and let

    $$ p = Y p_R + \beta_1 Z \bar{p}_N ; $$

otherwise, define a scaling factor $\beta_2$ and let

    $$ p = \beta_2\, Y p_R + Z \bar{p}_N . $$

(8)  Determine a step length, $\alpha$, that generates an
acceptable reduction in the penalty function $P(x,\rho)$, using
a safeguarded cubic or parabolic step length algorithm (e.g.,
the procedure described in Gill and Murray, 1974). Special care
must be exercised in the step length algorithm to avoid
difficulties if $P(x,\rho)$ is unbounded below along the given
search direction (see Section 3.5.1).

(9)  Set $x^{(k+1)}$ to $x^{(k)} + \alpha p$, and return to
Step (1).
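Steps (2), (3), (5) and (7a) can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the authors' implementation: A is assumed to have full column rank, $Z^T S Z$ is assumed positive definite (so an ordinary linear solve stands in for the modified Cholesky factorization), and the scaling test (6) is omitted. NumPy returns A = QR, so its Q is the transpose of the Q used in the text. The function name and the test data are hypothetical.

```python
import numpy as np

def penalty_search_direction(A, c, g, S, rho):
    """Range-space/null-space solution of QP1 (full-rank, positive definite case)."""
    n, t = A.shape
    Q, R_full = np.linalg.qr(A, mode="complete")   # A = Q @ R_full
    Y, Z = Q[:, :t], Q[:, t:]                      # range and null-space bases
    R = R_full[:t, :]
    lam = np.linalg.solve(R, Y.T @ g)              # R lam = Y^T g
    p_R = np.linalg.solve(R.T, -c - lam / rho)     # equation (5)
    p = Y @ p_R
    if Z.shape[1] > 0:
        H = Z.T @ S @ Z                            # projected Hessian approximation
        p_N = np.linalg.solve(H, -Z.T @ (g + S @ p))
        p = p + Z @ p_N
    return p, lam

A_demo = np.array([[1.0], [1.0], [0.0]])           # one active constraint, n = 3
c_demo = np.array([0.5])
g_demo = np.array([1.0, 2.0, 3.0])
S_demo = np.diag([2.0, 3.0, 4.0])
p_demo, lam_demo = penalty_search_direction(A_demo, c_demo, g_demo, S_demo, rho=10.0)
```

The returned direction satisfies the linear constraints of QP1, and Sp + g lies in the range of A, which is the first-order optimality condition of the quadratic program.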
2.2.  Barrier Trajectory Algorithm

2.2.1.  Properties of the search direction

The search direction of the barrier trajectory algorithm is
(ideally) constructed as the solution of the following quadratic
program:

    QP2:  min         $\tfrac{1}{2}\, p^T S p + p^T g$
          subject to  $A^T p = d$,

where $d_i = -c_i + r/\lambda_i$; c denotes the vector of
constraints currently considered "active"; A is a matrix whose
columns are the gradients of the active constraints; $\lambda$
is an estimate of the Lagrange multipliers corresponding to the
active constraints; r is the current value of the barrier
parameter; g is the gradient of F; and S is an approximation to
the Hessian of the Lagrangian function.
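The form of d can be motivated by a short calculation. This is a sketch from the stationarity condition of (3), not a derivation quoted from the paper; $a_i$ denotes the gradient of $c_i$, i.e. a column of A.

```latex
\begin{aligned}
\nabla B(x,r) = g(x) - \sum_{i} \frac{r}{c_i(x)}\, a_i(x) &= 0
  && \text{at } x = x_B^*(r), \\
\text{so, comparing with (1),}\quad \lambda_i = \frac{r}{c_i(x)}
  \;&\Longleftrightarrow\; c_i(x) = \frac{r}{\lambda_i}, \\
c_i(x+p) \approx c_i(x) + a_i^T p = \frac{r}{\lambda_i}
  \;&\Longrightarrow\; a_i^T p = -c_i + \frac{r}{\lambda_i} = d_i .
\end{aligned}
```

In words, the linear constraints of QP2 ask the linearized active constraints to take the values they would have at the point of the barrier trajectory associated with r and the current multiplier estimate.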

It is essential to achieve a reduction in the barrier function
B(x,r) at each iteration, because the barrier function serves as
a convenient "merit function" for measuring progress toward x*.
The derivation of the barrier trajectory algorithm indicates
that the search direction given by the solution of QP2 may not
always be a descent direction for B(x,r). Therefore, the
null-space component of the search direction may alternatively
be computed to minimize a quadratic approximation to the
Lagrangian function, independent of the component in the range
of the columns of A. This alternative formulation of the search
direction is necessary because of the quite different roles of
the penalty and barrier parameters as the solution is
approached.

2.2.2.     Calculation  of  the  search  direction 

At the beginning of the k-th iteration of the barrier trajectory
algorithm, the same vectors and matrices are available as for
the penalty trajectory algorithm. The iterates generated by the
barrier trajectory algorithm necessarily lie in the strict
interior of the feasible region; this algorithm is intended for
use on problems where some or all of the problem functions may
be ill-defined or undefined outside the feasible region.

The computational procedures followed during the k-th iteration
are:

(1)  Determine the set of "active" constraints, denoted by c
(see Section 3.2); form the matrix A, whose columns are the
columns of $A^{(k)}$ corresponding to the active set. By
construction, A has $\le n$ columns.

(2)  Factorize A such that

    $$ QA = \begin{bmatrix} R \\ 0 \end{bmatrix}, \qquad Q^T Q = I , $$

as before.

(3)  Determine the Lagrange multiplier estimate $\lambda$ by
solving:

    $$ R\lambda = Y^T g^{(k)} , $$

so that $\lambda$ is the least-squares solution of
$\min \|A\lambda - g^{(k)}\|_2^2$; a similar procedure to that
given for the penalty trajectory algorithm is followed if A is
rank-deficient. If one or more components of $\lambda$ are
negative, the constraint corresponding to the most negative is
deleted from the active set; the modified A is then factorized,
and the new $\lambda$ vector calculated for the re-defined
active set. Since the new A is simply the previous A with one
column deleted, the new factorization can be obtained by a
simple updating scheme (Gill, Golub, Murray, and Saunders,
1974).

(4)  Determine the barrier parameter, r (Murray and Wright,
1976b).

(5)  Determine the vector d according to the following rules.
Define $s = \|c\| + \|Z^T g^{(k)}\|$, and set
$\epsilon = \gamma r / s$, where $\gamma > 1$ (say, 2). Let
$\bar{r}$ be the barrier parameter from the previous iteration;
then:

    if $\lambda_i > \epsilon$, set $d_i = -c_i + \dfrac{r}{\lambda_i}$;

    if $\lambda_i \le -\epsilon$, set $d_i = -\Bigl(1 - \dfrac{r}{\bar{r}}\Bigr) c_i$;

    otherwise,


(6)  Compute the vector $p_R$, which is the solution of the
linear system:

    $$ A^T Y p_R = R^T p_R = d . $$

In this way, $p_R$ satisfies the desired linear equality
constraints of QP2 for those problem constraints for which
$\lambda_i$ is sufficiently positive; an alternative
relationship is satisfied for each "active" constraint that has
an insufficiently positive multiplier estimate. Again, the
rank-deficient case is treated as for the penalty trajectory
algorithm.

(7)  Compute the modified Cholesky factorization of
$Z^T S^{(k)} Z$ (as in the penalty trajectory algorithm); the
factorization will be denoted by $LDL^T$.

(8)  Determine $\bar{p}_N$ by solving:

    $$ LDL^T \bar{p}_N = -Z^T g^{(k)} . $$

Test whether:

    $$ \|\bar{p}_N\| \le M \|p_R\| \quad \text{and} \quad \|p_R\| \le M \|\bar{p}_N\| , \tag{7} $$

for M a reasonably large positive number.

(a)  If the test (7) is satisfied, obtain $p_N$ by solving

    $$ LDL^T p_N = -Z^T \bigl( g^{(k)} + S^{(k)} Y p_R \bigr) , $$

and define the trial search direction as:

    $$ p = Y p_R + Z p_N . $$

If p is not a descent direction for B(x,r), re-define p as:

    $$ p = Y p_R + Z \bar{p}_N . $$

This latter definition is guaranteed to yield a descent
direction for B(x,r).

(b)  If the test (7) is not satisfied, then adjust the scaling,
as in the penalty trajectory algorithm.

(9)  Determine a step length, $\alpha$, that accomplishes a
suitable reduction in B(x,r), using special procedures designed
for one-dimensional minimization with respect to the logarithmic
barrier function (see Section 3.5.2). During the search
procedure, record whether violation of any constraint currently
considered "inactive" restricts the step length; if so, this
constraint will be added to the active set at the next
iteration.

(10)  Set $x^{(k+1)}$ to $x^{(k)} + \alpha p$, and return to
Step (1).
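The barrier search direction with its descent safeguard can be sketched as follows. The sketch makes several simplifying assumptions that are not in the paper: every problem constraint is in the active set (so the barrier gradient needs only A and c), all multiplier estimates are positive so that only the rule $d_i = -c_i + r/\lambda_i$ applies, A has full column rank, and a plain linear solve replaces the modified Cholesky factorization. The scaling test (7) is omitted, and all names and data are hypothetical.

```python
import numpy as np

def barrier_search_direction(A, c, g, S, r):
    """Sketch of steps (2)-(8a) for the all-active, positive-multiplier case."""
    n, t = A.shape
    Q, R_full = np.linalg.qr(A, mode="complete")   # A = Q @ R_full
    Y, Z, R = Q[:, :t], Q[:, t:], R_full[:t, :]
    lam = np.linalg.solve(R, Y.T @ g)              # R lam = Y^T g
    d = -c + r / lam                               # rule for lam_i > epsilon only
    p_range = Y @ np.linalg.solve(R.T, d)          # Y p_R with R^T p_R = d
    if Z.shape[1] == 0:
        return p_range, lam, d
    H = Z.T @ S @ Z
    p_N_bar = np.linalg.solve(H, -Z.T @ g)
    p_N = np.linalg.solve(H, -Z.T @ (g + S @ p_range))
    p = p_range + Z @ p_N
    grad_B = g - r * (A @ (1.0 / c))               # gradient of B when all c_i are active
    if grad_B @ p >= 0.0:                          # trial direction is not a descent direction
        p = p_range + Z @ p_N_bar
    return p, lam, d

# Hypothetical data: F = x1^2 + x2^2, c = x1 + x2 - 1, evaluated at x = (1, 1).
A_demo = np.array([[1.0], [1.0]])
c_demo = np.array([1.0])
g_demo = np.array([2.0, 2.0])
S_demo = 2.0 * np.eye(2)
p_demo, lam_demo, d_demo = barrier_search_direction(A_demo, c_demo, g_demo, S_demo, r=0.1)
```

For this data the multiplier estimate is 2, d is -0.95, and the step moves toward the constraint boundary without reaching it when a unit step is taken.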


3.  Some Considerations of Numerical Analysis in
Implementation of the Trajectory Algorithms

In this section, we consider some selected aspects of
implementation of the trajectory methods, from the viewpoints of
numerical analysis and algorithm definition. It will be stressed
throughout this discussion that an implementation could not
achieve practical success if the definition of the algorithm
depended critically on properties that hold only in a close
neighborhood of x*; the algorithm should produce sensible
results, even when such conditions are not satisfied at the
current point.

3.1.     Selection  of  the  Active  Set  for  the  Penalty 
Trajectory  Algorithm 

The "active set" of constraints is defined at each iteration of
the penalty trajectory algorithm as the set of constraints whose
values are less than a specified small positive number. With
this definition, the active set is equivalent to the "violated
set", and can easily be determined. Such a strategy is
reasonable because the penalty trajectory algorithm is based on
properties of the quadratic penalty function, and for a
sufficiently large value of $\rho$, the set of constraints
violated at $x^*(\rho)$ is identical to the set of constraints
active at x* (Fiacco and McCormick, 1968). After the first few
iterations, the active set typically remains fixed for the rest
of the computation.

It was noted in the definition of the algorithm that a special
procedure is used when more than n constraints are violated at
the beginning of an iteration. If more than n constraints are
violated, and A has full rank, the search direction, p, is
chosen to attempt to minimize $\|c(x+p)\|_2^2$, by computing p
as the solution of $\min \|c + A^T p\|_2^2$. The search
direction in this case is calculated as follows:

(1)  Factorize $A^T$ in the form

    $$ Q A^T = \begin{bmatrix} R \\ 0 \end{bmatrix}, \qquad Q^T Q = I , $$

where R is upper triangular and non-singular.

(2)  Solve $Rp = -Y^T c$, where the columns of Y are given by
the first n rows of Q.

The computational procedure is very similar to the calculation
of the Lagrange multiplier estimates, and relies on orthogonal
transformations to reduce $A^T$ to upper triangular form.

Normally, the condition that more than n constraints are
violated occurs because the current point is a poor estimate of
the solution, and does not hold at the next iterate. However, it
is conceivable that this condition could exist even at x*, so
that possibly every iteration might be special. In this case,
the Hessian matrix of the penalty function is not
ill-conditioned as the penalty parameter approaches its limit.
This special procedure has the same effect as choosing
$\rho = \infty$ in the usual definition of the algorithm, and
hence is equivalent to the Gauss-Newton method.
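The special procedure for more than n violated constraints amounts to a linear least-squares solve by QR factorization. The sketch below assumes A (n by t, t > n) has full row rank; NumPy's reduced factorization returns $A^T = QR$, whose Q plays the role of Y in the text. The function name and data are hypothetical.

```python
import numpy as np

def overdetermined_direction(A, c):
    """Solve min || c + A^T p ||_2 via QR of A^T (Gauss-Newton step)."""
    Q, R = np.linalg.qr(A.T)                 # reduced form: Q is t x n, R is n x n
    return np.linalg.solve(R, -Q.T @ c)      # R p = -Y^T c with Y = Q

A_demo = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])         # n = 2 variables, t = 3 violated constraints
c_demo = np.array([1.0, 2.0, 3.0])
p_demo = overdetermined_direction(A_demo, c_demo)
p_reference = np.linalg.lstsq(A_demo.T, -c_demo, rcond=None)[0]
```

The result agrees with a general-purpose least-squares routine, while the factorization never forms the product of A with its transpose.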

3.2.     Selection  of  the  Active  Set  for  the  Barrier 
Trajectory  Algorithm 

The criteria for selecting the active set in the barrier
trajectory algorithm are not so straightforward as in the
penalty trajectory algorithm. Because the barrier trajectory
algorithm is a feasible-point method, all problem constraints
are satisfied at every iteration, and the subset of active
constraints must be determined by analysis of the behavior of
the constraints as the computation proceeds.

The procedure for determining the initial active set is
described in Murray and Wright (1976b), and has been
satisfactory on the examples tested. The active set tends to be
altered only during the early iterations, because of the
possibility of misleading local indications that certain
constraints are active. The following rules are used to modify
the active set at each iteration:

(1)  The constraint corresponding to the most negative Lagrange
multiplier estimate (if one exists) is deleted, and the
remaining multipliers are modified accordingly.

(2)  If any constraint considered active appears to be bounded
away from zero as the solution is approached, it is deleted from
the active set. This decision is highly dependent on scaling;
further discussion is given in Murray and Wright (1976b).

(3)  If the step-length algorithm was restricted because a
supposedly inactive constraint was violated, this constraint is
added to the active set at the beginning of the next iteration,
and will not be deleted during that iteration, regardless of the
sign of its multiplier estimate.

(4)  If the number of active constraints exceeds n following the
addition of (3), the constraint with the largest value of
$c_i(x)$ is deleted from the active set (a further
scaling-dependent decision).

3.3.     Use  of  Orthogonal  Factorizations 

A factorization involving reduction of A to triangular form by
application of orthogonal transformations is used in several
steps of the trajectory algorithms. Such a factorization is
convenient and reliable for computation, and has many advantages
over alternative procedures. For example, in some algorithms for
solving P1, the matrix $A^T A$ is formed and used to solve
linear systems; the poor numerical properties of this strategy
are well-known (see Peters and Wilkinson, 1970), especially the
possible squaring of the condition number that may result from
the formation of $A^T A$. Furthermore, if the matrix A does not
have full rank, $A^T A$ is singular, and some steps of the
algorithm may then be undefined.
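The squaring of the condition number is easy to check numerically. The matrix below is an arbitrary illustration with two nearly parallel columns; in the 2-norm, the condition number of $A^T A$ equals the square of that of A.

```python
import numpy as np

def condition_numbers(A):
    """Return the 2-norm condition numbers of A and of A^T A."""
    return np.linalg.cond(A), np.linalg.cond(A.T @ A)

A_demo = np.array([[1.0, 1.0],
                   [1.0, 1.001],
                   [1.0, 0.999]])            # nearly dependent columns
cond_A, cond_AtA = condition_numbers(A_demo)
```

A condition number in the thousands for A thus becomes one in the millions for the normal-equations matrix, which is the loss of accuracy the orthogonal factorization avoids.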

Computation of the complete orthogonal factorization of A means
that steps of the trajectory algorithm can be defined (with
relatively little extra work) even if A is rank-deficient (see
Sections 3.3.1 and 3.3.2). Although only the singular value
decomposition can fully reveal the closeness of the columns of A
to linear dependence, the complete orthogonal factorization
provides adequate information in many applications (see Golub,
Klema, and Stewart, 1976), since the conditioning of the
triangular matrix R serves to indicate the "conditioning" of A.

3.3.1.     Calculation  of  a  Lagrange  multiplier 
estimate 

The Lagrange multiplier estimate at each iteration of the
trajectory algorithms is computed as a least-squares solution of
$\min \|A\lambda - g\|_2^2$; this first-order estimate is
acceptable, since the local rate of convergence of the
trajectory methods is not restricted to the rate of convergence
of the multipliers (see Wright, 1976). The orthogonal
factorization of A can be used to calculate a minimum-length
least-squares solution, even when A does not have full column
rank; this alternative is not possible with techniques that
involve forming $A^T A$.

When  reducing    A    to  upper  triangular  form, 
column  interchanges  are  carried  out  so  that  the 
reduced  column  of  largest  magnitude  is  selected 
as  the  next  column  to  be  reduced;  the  matrix  is 
considered  to  be  numerically  rank-deficient  if 
the  norms  of  all  unreduced  columns  are  less  than 
a  prescribed  tolerance.     In  this  way,  all  diagonal 
elements  of    R    are  bounded  below  by  the  specified 
tolerance.     Although  the  ill-conditioning  of  R 
does  not  necessarily  reveal  itself  by  the  presence 
of  a  diagonal  element  that  is  very  small  relative 
to  the  largest  diagonal  element,  prevention  of  a 
too-small  diagonal  element  is  sufficient  in  many 
cases  to  control  serious  ill-conditioning  of  R. 
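
The column-interchange strategy and the minimum-length solution can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation: the function names `pivoted_qr` and `min_length_lstsq`, the tolerance value, and the test data are assumptions, and the second orthogonal reduction is done with a plain QR call rather than a specialized routine.

```python
import numpy as np

def pivoted_qr(A, tol=1e-10):
    """Householder QR with column interchanges: A[:, perm] = Q @ R."""
    R = np.array(A, dtype=float)
    m, n = R.shape
    Q = np.eye(m)
    perm = np.arange(n)
    rank = 0
    for k in range(min(m, n)):
        # Select the unreduced column of largest norm as the next pivot.
        norms = np.linalg.norm(R[k:, k:], axis=0)
        j = k + int(np.argmax(norms))
        if norms[j - k] < tol:
            break  # all remaining columns are negligible: rank-deficient
        R[:, [k, j]] = R[:, [j, k]]
        perm[[k, j]] = perm[[j, k]]
        # Householder reflection that zeroes R[k+1:, k].
        v = R[k:, k].copy()
        v[0] += np.copysign(np.linalg.norm(v), v[0])
        v /= np.linalg.norm(v)
        R[k:, :] -= 2.0 * np.outer(v, v @ R[k:, :])
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
        rank += 1
    return Q, np.triu(R), perm, rank

def min_length_lstsq(A, g, tol=1e-10):
    """Minimum-length solution of min ||A lam - g||_2 via a complete
    orthogonal factorization (pivoted QR followed by a second reduction)."""
    A = np.asarray(A, dtype=float)
    Q, R, perm, r = pivoted_qr(A, tol)
    n = A.shape[1]
    if r == 0:
        return np.zeros(n)
    # Second reduction: R[:r, :] = T @ Z.T with T lower triangular (r x r).
    Z, T_transposed = np.linalg.qr(R[:r, :].T)
    y = np.linalg.solve(T_transposed.T, Q[:, :r].T @ g)
    lam = np.empty(n)
    lam[perm] = Z @ y  # undo the column interchanges
    return lam

# Hypothetical rank-deficient data: the third column is the sum of the first two.
A = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 2.0], [2.0, 1.0, 3.0]])
g = np.array([1.0, 2.0, 3.0, 4.0])
lam = min_length_lstsq(A, g)
```

The result can be compared against `np.linalg.lstsq`, which returns the same minimum-length solution by way of the singular value decomposition.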

3.3.2.     Calculation  of  the  search  direction 

The  search  direction  in  both  trajectory 
algorithms  is  computed  in  two  orthogonal  components 
—  one  in  the  range  of  the  columns  of    A,  the  other 
in  the  corresponding  null  space.     This  definition 
results  from  the  characterization  that  the  search 
direction  must  satisfy  a  set  of  linear  equality 
constraints  of  the  form: 

$$A^T p = b, \qquad (8)$$

where    b     is  some  vector  depending  on  the  algorithm. 
If    A    has  full  rank,  these  equality  constraints 
uniquely  determine  the  component  of    p     in  the 
range  of  the  columns  of    A,  which  is  calculated 
as  follows. 

The orthogonal matrix Q that reduces A
to upper triangular form is explicitly formed,
by multiplying out the orthogonal transformations
used in the reduction. Once Q is available, its
rows, appropriately partitioned, provide the
required orthogonal bases for the range and null
space of the columns of A.

In the full rank case, it is straightforward
to compute the component of p in the range of A.
Since $p = Y p_R + Z p_N$, the equality constraints (8)
imply:

$$A^T p = A^T (Y p_R + Z p_N) = A^T Y p_R = R^T p_R = b,$$

which gives $p_R$ as the solution of a non-singular
triangular system.

If A is rank-deficient, the component $p_R$
may be obtained as the least-squares solution of
$\min \|A^T p - b\|_2^2$, again using the complete orthogonal
factorization of A.

In either case, the calculation of $p_R$ is
completely straightforward.
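
The full-rank computation above can be sketched in a few lines. This is a minimal illustration assuming NumPy, whose convention is A = QR (so NumPy's Q is the transpose of the reducing matrix described in the text); the function name `range_space_component` and the data are hypothetical.

```python
import numpy as np

def range_space_component(A, b):
    """Return (Y, Z, p_R) such that A^T (Y p_R) = b, assuming full column rank."""
    n, m = A.shape
    Q, R = np.linalg.qr(A, mode="complete")  # A = Q[:, :m] @ R[:m, :]
    Y, Z = Q[:, :m], Q[:, m:]                # bases for range and null space
    # R^T p_R = b is a lower-triangular system; a general solver is used
    # here only to keep the sketch short.
    p_R = np.linalg.solve(R[:m, :].T, b)
    return Y, Z, p_R

A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0], [1.0, -1.0]])
b = np.array([1.0, -2.0])
Y, Z, p_R = range_space_component(A, b)
p = Y @ p_R  # adding any Z @ p_N leaves A^T p = b undisturbed
```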

3.4.     Approximation  of  the  Hessian  of  the  Lagrangian 
Function 

Alternative techniques for approximating the
Hessian of the Lagrangian function under various
circumstances will not be discussed in any detail
(see Murray and Wright, 1976b, for such a
discussion), but we shall consider one key property of
the Hessian approximation.

The assumed second-order Kuhn-Tucker conditions
imply that the matrix $Z^T W Z$ must be positive
definite at x*, where Z is defined in terms of
A(x*), and W is the Hessian of the Lagrangian
function; however, W itself need not be positive
definite, or even non-singular, at x* or in any
neighborhood of x*.

Certain approaches to solving P1 impose
additional conditions on related matrices; for
example, methods involving augmented Lagrangian
functions (see Powell, 1969; Fletcher, 1974)
require that the penalty parameter be large enough
so that the Hessian of the augmented function is
positive definite. In both trajectory algorithms,
however, only the projected Hessian, $Z^T S Z$, must be
positive definite at every iteration, in order for
the solutions of the quadratic programs QP1 and
QP2 to be bounded. The vector $p_N$ is the solution
of a linear system:

$$Z^T S Z \, p_N = Z^T d, \qquad (9)$$

for some vector d, and should be the step to the
minimum of a quadratic function related to the
Lagrangian function.

Accordingly, the matrix used to calculate $p_N$
is always maintained as numerically positive
definite. When $Z^T S Z$ is updated by a quasi-Newton
technique, positive definiteness is maintained by
updating the Cholesky factorization of the
projected matrix, as in revised quasi-Newton methods
for unconstrained minimization (Gill and Murray,
1972b). When $Z^T S Z$ is obtained from exact, or
finite-difference approximations to, second
derivatives, the modified Cholesky factorization of
$Z^T S Z$ is computed in order to solve the system (9).
In all cases, the matrix used to solve (9) for $p_N$
is represented as $LDL^T$, where L is unit lower
triangular, and D is a diagonal matrix with all
elements strictly positive. Such a procedure
assures that the portion of the search direction
in the null space of the columns of A is always
well-defined, and bounded.
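
The idea of keeping every element of D strictly positive can be illustrated with a deliberately simplified factorization. The sketch below is not the Gill and Murray modified Cholesky algorithm (which also bounds the off-diagonal elements of L); it only raises a pivot to a hypothetical threshold `delta` when it would otherwise be too small, so that $LDL^T$ equals the input plus a non-negative diagonal correction. All names and data are assumptions.

```python
import numpy as np

def modified_ldl(M, delta=1e-8):
    """Return (L, d) with L unit lower triangular and d > 0 such that
    L diag(d) L^T = M + E, where E is diagonal and non-negative."""
    n = M.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for j in range(n):
        dj = M[j, j] - np.sum(L[j, :j] ** 2 * d[:j])
        d[j] = max(dj, delta)  # force a strictly positive pivot
        for i in range(j + 1, n):
            L[i, j] = (M[i, j] - np.sum(L[i, :j] * L[j, :j] * d[:j])) / d[j]
    return L, d

def solve_ldl(L, d, rhs):
    """Solve L diag(d) L^T p = rhs (general solves used for brevity)."""
    z = np.linalg.solve(L, rhs)
    return np.linalg.solve(L.T, z / d)

H_proj = np.array([[4.0, 2.0], [2.0, 3.0]])   # positive definite: left unchanged
L, d = modified_ldl(H_proj)
p_N = solve_ldl(L, d, np.array([2.0, 1.0]))

H_indef = np.array([[1.0, 2.0], [2.0, 1.0]])  # indefinite: diagonal is raised
L2, d2 = modified_ldl(H_indef)
```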

3.5.     Step-Length  Algorithms 

3.5.1.     Detection  and  correction  of  unbounded 
decrease  of  penalty  function 

Even when the problem P1 has a bounded
solution, the corresponding penalty function or
augmented Lagrangian function may be unbounded below,
for arbitrarily large values of the penalty
parameter (Powell, 1972). Accordingly, when executing
a one-dimensional minimization with respect to a
penalty function or augmented Lagrangian function,
care must be exercised to avoid the possibility
of taking an excessively large step.

In particular, the safeguarded cubic or
quadratic step-length algorithms (Gill and Murray,
1974) used in the penalty trajectory algorithm
require specification of an upper bound on the
step length. In the current implementation of
the penalty trajectory algorithm, the upper
bound is set to correspond to a step of
"reasonable" size, rather than an extremely large value.
In some cases, the upper bound may impose an
unnecessary limit on the stepsize; however, in
general such a restriction will cause no serious
loss of efficiency for the overall computation,
since the next iteration usually corrects the
possible poor scaling of the search direction.
This conservative strategy is considered to be
justified by the extreme difficulties that result
if an enormous step is taken because the penalty
function is unbounded below: either the next
iterate is completely unreasonable, or a large
number of evaluations of the problem functions are
required before the unboundedness is detected.

In the penalty trajectory algorithm, it is
considered that the penalty function may be
unbounded along the given direction if the step taken is
the specified upper bound. Almost always, the
indicated unboundedness can be eliminated simply
by increasing the penalty parameter.
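
The role of the upper bound can be illustrated with a toy example. The sketch below is a simple step-doubling search, used here as a stand-in for the safeguarded cubic or quadratic algorithms cited above; the names `penalty` and `bounded_step`, the bound of 8, and the test problem (minimize x^3 subject to x = 1, whose quadratic penalty function is unbounded below for every value of the penalty parameter) are assumptions chosen for illustration.

```python
def penalty(x, rho):
    """Quadratic penalty function for: minimize x**3 subject to x = 1."""
    return x ** 3 + 0.5 * rho * (x - 1.0) ** 2

def bounded_step(phi, alpha0=1.0, alpha_max=8.0):
    """Double the step while phi decreases, never exceeding alpha_max.
    Returns (step, at_bound); at_bound signals possible unboundedness."""
    alpha, best = 0.0, phi(0.0)
    trial = alpha0
    while trial < alpha_max:
        value = phi(trial)
        if value >= best:
            return alpha, False  # decrease stopped inside the bound
        alpha, best = trial, value
        trial *= 2.0
    if phi(alpha_max) < best:
        return alpha_max, True   # still decreasing at the upper bound
    return alpha, False

# From x = -2 along the direction -1 the penalty function decreases without limit.
step, at_bound = bounded_step(lambda a: penalty(-2.0 - a, rho=1.0))
# A function that is bounded along the line stops well inside the bound.
step2, at_bound2 = bounded_step(lambda a: (a - 3.0) ** 2)
```

When `at_bound` is true, a caller following the strategy described above would treat the penalty function as possibly unbounded along that direction and increase the penalty parameter.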

3.5.2.     Special techniques for the barrier
trajectory algorithm

At each iteration of the barrier trajectory
algorithm, a step-length algorithm is executed with
respect to the logarithmic barrier function, which
thus serves as a "merit function". Several authors
(Fletcher and McCann, 1969; Lasdon, et al., 1973)
have noted the deficiencies of standard step-length
algorithms, which are usually based on approximation
by low-order polynomials, when applied to the
logarithmic barrier function. Therefore, the
step-length algorithm of the barrier trajectory method
makes use of special techniques that exploit the
known properties of the logarithmic barrier function
to allow more efficient estimation of an appropriate
step length; these techniques are based on simple
approximating functions that contain a logarithmic
singularity. Only a small amount of additional
calculation is required to fit the special
approximating functions, and their use leads to a
significant increase in efficiency of the one-dimensional
minimization, compared to standard procedures
(Murray and Wright, 1976a).

4.  Conclusions

The  penalty  and  barrier  trajectory  algorithms 
are  based  on  the  mathematical  properties  of  the 
approach  to    x*     of  the  successive  iterates  generated 
by  the  quadratic  penalty  function  and  logarithmic 
barrier  function,  respectively.       In  theory,  these 
algorithms  have  several  desirable  properties  —  for 
example, their derivation does not depend on
conditions that hold only in a close neighborhood of x*,
and their rate of convergence in the limit is
arbitrarily close to that of linearly constrained
Lagrangian algorithms (described in Robinson, 1972;
Rosen and Kreuser, 1972). In practice, the current
implementations of the trajectory algorithms have
been successful on many problems, deliberately
including examples for which the ideal assumptions
are violated. The results thus far indicate that
these algorithms compare favorably with similarly
careful implementations of other algorithms to
solve P1 (see Wright, 1976, for some typical
numerical results).

The overall aim of this paper has been to
illustrate some of the considerations of numerical
analysis that enter the choice of computational
procedures for selected aspects of the trajectory
algorithms. Numerical analysis may not play a
significant role in the process of verifying that
the expected behavior of an algorithm under ideal
conditions is displayed numerically. However,
it is an elementary fact of numerical analysis
that theoretically equivalent mathematical
procedures do not yield equivalent, or even close,
numerical results, and it is an elementary fact
of life that hoped-for conditions are not always
satisfied. Considerations of numerical analysis
should, therefore, be applied to every aspect of
the definition and implementation of optimization
algorithms in general.

Abstract

The problem considered is the calculation
of the least value of a general
differentiable function of several variables.
A brief review of two types of minimization
methods based on nonquadratic models
is offered. The first involves solving
systems of linear equations in every
iteration. The second is derived from a
generalization of the standard conjugate
gradient methods.

1. Introduction

The problem to be discussed is the
computation of the unconstrained minimum
value of a general differentiable
multivariable function f(x). Most of the
currently available algorithms use as a
fundamental model the quadratic function

$$q(x) = \tfrac{1}{2} (x - \hat{x})^T Q (x - \hat{x}) \qquad (1)$$

where Q is a positive definite matrix and
$\hat{x}$ is the location of the minimum solution
of (1). It is of interest to investigate
more general models which may better
reflect the local behavior of general
continuous functions. It would be, for
instance, desirable to develop methods that
can successfully handle cases where the
Hessian matrices are positive semi-definite
at the solution or elsewhere.

An interesting departure from model
(1) can be accomplished if we assume that

$$f(x) = F(q(x)) \qquad (2)$$

where F is a differentiable, strictly increasing
function of a single variable
q > 0. The gradient of f(x) is

$$g(x) = F' Q (x - \hat{x}) \qquad (3)$$

where

$$F' = \frac{dF}{dq} \qquad (4)$$

and since F' > 0 the minimum of f(x) takes
place at $x = \hat{x}$. Furthermore, f(x) can be
minimized in at most n steps if we use:

(a) a set of search directions mutually
conjugate with regard to Q, i.e.,
$d_0, d_1, \ldots, d_{n-1}$ satisfying

$$d_i^T Q d_j = 0, \quad i \ne j,$$

and,

(b) the optimal steps $\lambda_i$ satisfying
the equations

$$x_{i+1} = x_i + \lambda_i d_i \qquad (5)$$

$$d_i^T g_{i+1} = 0 \qquad (6)$$

for $i = 0, 1, 2, \ldots, n-1$.

This property of F(q(x)) becomes obvious
if we make the observation that the
nonlinear function F(q(x)) leaves the
isocontour curves of q(x) unchanged,
altering only the function values on these
curves.

Thus, the function f(x) = F(q(x)) can
be minimized in a finite number of steps
if we generate Q-conjugate search
directions in the process of minimizing f(x).
Obviously, a standard conjugate-gradient
algorithm such as Fletcher-Reeves' or
Davidon-Powell-Fletcher's will not be
finite (unless F(q) = q) since the
gradient vectors of F(q(x)) are multiples of
the gradients of q(x) and this would
alter the search directions generated by a
standard conjugate gradient algorithm.
However, a relatively simple modification
of these algorithms will provide this
property.

A  pioneering  work  in  this  direction  has 
been  done  by  Fried  [1971]  who  has  con- 
structed a  modified  Fletcher-Reeves  meth- 
od for  minimizing  f (x)   of  the  form 


ir 


(7) 


where  f  is  a  constant  value. 

Function    (7)  satisfies  the  relationship 


f(x)   =  |^(x-x)^g(x) 


(8) 


and has also been studied as an optimization
model by Jacobson-Oksman [1972] and
Kowalik-Ramakrishnan [1976]. They have
used equation (8) as a basis for an
optimization method which does not produce
conjugate directions but is finite for
(7). This method is briefly described in
Section 2. A similar class of functions
has also been studied by Davison and Wong
[1974]. Section 3 shows a derivation of
modified conjugate gradient methods
minimizing f(x) = F(q(x)) and a detailed
analysis of $f(x) = \frac{1}{p} q^p$. Section 4 contains
numerical results and Section 5 is a
conclusion.

2. A method based on the homogeneous model

We can rewrite (8) in the form

$$f(x) = \frac{1}{\gamma} (x - \hat{x})^T g(x) + \hat{f} \qquad (9)$$

where $\gamma = 2$ for function (1).
If we define

$$\alpha = (\hat{x}^T, \gamma, \tilde{f})^T \qquad (10)$$

$$\tilde{f} = \gamma \hat{f} \qquad (11)$$

$$y^T = (g(x)^T, f(x), -1) \qquad (12)$$

$$v = x^T g(x) \qquad (13)$$

then equation (9) can be restated as:

$$y^T \alpha = v \qquad (14)$$

If we compute y and v at n+2 distinct
points (n is the size of the vector x) we
get a system of linear equations

$$Y \alpha = w \qquad (15)$$

where

$$Y = \begin{bmatrix} y_1^T \\ y_2^T \\ \vdots \\ y_{n+2}^T \end{bmatrix}, \qquad w = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_{n+2} \end{bmatrix}$$

and if Y is nonsingular we can solve (15)
for $\alpha$ which contains the solution $\hat{x}$ to the
minimization problem $\min_x f(x)$.

Clearly, it is possible to get an
algorithm minimizing functions satisfying
(8) in n+2 steps. For general functions
an iterative procedure can be constructed
where a sequence of approximations
$x_1, \ldots, x_i, \ldots$ is generated such that $x_i \to x^*$
where x* is a local minimum. In this
procedure, equation (15) becomes iterative,
$Y_i \alpha_i = w_i$, and $Y_i$ is obtained by removing
one row from $Y_{i-1}$ and inserting the new
row $y(x_i)$ computed at $x_i$. The vector $w_i$
is also modified by one component. The
method requires solving systems of
equations which differ by one row. Solutions
to such systems can be obtained
inexpensively and accurately by using different
factorization methods (Kowalik-Ramakrishnan
[1976] and Kowalik-Kumar [1976]).

3. A modified conjugate-gradient method.

In the method of conjugate gradients
of Fletcher and Reeves [1974] the search
direction is calculated as a sum of two
vectors,

$$d_i = -g_i + \beta_i d_{i-1}, \quad i \ge 1, \qquad (17)$$

and

$$d_0 = -g_0. \qquad (18)$$

The scalar multiplier $\beta_i$ can be
determined from the conjugacy condition

$$d_i^T y_{i-1} = 0 \qquad (19)$$

where

$$y_{i-1} = g_i - g_{i-1} = Q (x_i - x_{i-1}) \qquad (20)$$

Equations (19) and (20) give

$$\beta_i = \frac{g_i^T y_{i-1}}{d_{i-1}^T y_{i-1}} \qquad (21)$$

and from (17)

$$d_i = -g_i + \frac{g_i^T y_{i-1}}{d_{i-1}^T y_{i-1}} \, d_{i-1} \qquad (22)$$

Equation (22) is still suitable in the
case when f(x) = F(q(x)) except that we
have to redefine $y_{i-1}$. It can be seen
from (22) that if we define $y_{i-1}$ as

$$y_{i-1} = \frac{g_i}{F'_i} - \frac{g_{i-1}}{F'_{i-1}} \qquad (23)$$

then the search directions generated by
(22) will be collinear for any F(q(x)).
In other words, if (23) is used in
equation (22), then $d_0, d_1, \ldots, d_{n-1}$
are scaled conjugate gradient directions
produced by the standard Fletcher-Reeves
method for F(q(x)) = q(x).
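
The effect of the rescaled difference (23) can be checked on a small example. The sketch below is illustrative only: it assumes NumPy, a hypothetical diagonal Q, the model $f = \frac{1}{p} q^p$ with p = 3, exact knowledge of F', and exact line searches computed from q (which is legitimate because F is strictly increasing). The function names are assumptions, not the authors' code.

```python
import numpy as np

Q = np.diag([1.0, 4.0, 9.0])          # hypothetical positive definite matrix
x_hat = np.array([1.0, -2.0, 0.5])    # hypothetical minimizer
p = 3.0

def q(x):
    r = x - x_hat
    return 0.5 * r @ Q @ r

def F_prime(x):
    return q(x) ** (p - 1.0)          # derivative of F(q) = q**p / p

def grad_f(x):
    return F_prime(x) * (Q @ (x - x_hat))

def minimize(x, scaled):
    """Run n steps of recurrence (22); `scaled` selects (23) instead of (20)."""
    n = x.size
    g = grad_f(x)
    d = -g
    for i in range(n):
        lam = -(d @ Q @ (x - x_hat)) / (d @ Q @ d)  # exact step along d
        x_new = x + lam * d
        if i == n - 1:
            return x_new
        g_new = grad_f(x_new)
        if scaled:
            y = g_new / F_prime(x_new) - g / F_prime(x)   # definition (23)
        else:
            y = g_new - g                                 # definition (20)
        d = -g_new + (g_new @ y) / (d @ y) * d            # recurrence (22)
        x, g = x_new, g_new
    return x

x0 = np.array([4.0, 3.0, -2.0])
x_scaled = minimize(x0, scaled=True)
x_plain = minimize(x0, scaled=False)
print(np.linalg.norm(x_scaled - x_hat), np.linalg.norm(x_plain - x_hat))
```

With the scaled difference the iterates coincide with those of conjugate gradients applied to q, so the minimizer is reached after n = 3 steps up to rounding; the unscaled variant generally is not expected to terminate in n steps.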


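Now

$$\beta_i = \frac{g_i^T (\rho_i g_i - g_{i-1})}{d_{i-1}^T (\rho_i g_i - g_{i-1})} \qquad (24)$$

where

$$\rho_i = \frac{F'_{i-1}}{F'_i} \qquad (25)$$

Equation (24) can also be written as:

$$\beta_i = \rho_i \frac{\|g_i\|^2}{\|g_{i-1}\|^2} \qquad (26)$$

The modified formulas (24) and (26) to
compute $\beta_i$ can be useful if we know how
to compute $\rho_i$. Clearly $\rho_i$ depends on the
choice of F(q). Since available numerical
results suggest that the function
$\frac{1}{p} q(x)^p$ may be a good model for
unconstrained optimization we will assume now
that

$$f(x) = F(q(x)) = \frac{1}{p} q(x)^p, \quad p > 0. \qquad (27)$$

The gradient of $f(x_i)$ is

$$g_i = F'_i Q (x_i - \hat{x}) \qquad (28)$$

and

$$F'_i = q_i^{p-1} = (p f_i)^{(p-1)/p} = (p f_i)^{1-t} \qquad (29)$$

where

$$t = \frac{1}{p}. \qquad (30)$$

Furthermore

$$\rho_i = \frac{F'_{i-1}}{F'_i} = \left( \frac{f_i}{f_{i-1}} \right)^{t-1} \qquad (31)$$

and we assume that t is an unknown value
that has to be determined at each step of
the iterative process. From (28) we have

$$\rho_i = \frac{F'_{i-1}}{F'_i} = \frac{g_{i-1}^T (x_i - \hat{x})}{g_i^T (x_{i-1} - \hat{x})} = \frac{g_{i-1}^T x_i - g_{i-1}^T \hat{x}}{g_i^T x_{i-1} - g_i^T \hat{x}} \qquad (32)$$

and

$$\rho_i = \frac{g_{i-1}^T x_i - g_{i-1}^T x_{i-1} + 2 p f_{i-1}}{g_i^T x_{i-1} - g_i^T x_i + 2 p f_i} \qquad (33)$$

since

$$g_i^T \hat{x} = g_i^T x_i - 2 p f_i. \qquad (34)$$

We also have

$$g_i^T x_{i-1} - g_i^T x_i = -\lambda_{i-1} g_i^T d_{i-1} = 0 \qquad (35)$$

and from (31) and (33)

$$\left( \frac{f_i}{f_{i-1}} \right)^{t-1} = \frac{\lambda_{i-1} g_{i-1}^T d_{i-1} + 2 p f_{i-1}}{2 p f_i} \qquad (36)$$

or

$$\left( \frac{f_i}{f_{i-1}} \right)^{t} = t \, \frac{\lambda_{i-1} d_{i-1}^T g_{i-1}}{2 f_{i-1}} + 1. \qquad (37)$$

Introducing an abbreviated notation for
the coefficients of (37), with
$a_i = f_i / f_{i-1}$ and
$b_i = \lambda_{i-1} d_{i-1}^T g_{i-1} / (2 f_{i-1})$, we get

$$a_i^t = t b_i + 1. \qquad (38)$$

Equation (38) (first obtained by
Fried [1971]) is solved for a nontrivial
unique root $t_i$ which in turn is used to
calculate $\rho_i$ from equation (31).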
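
The determination of t from (38) and of $\rho_i$ from (31) can be sketched as follows. This is an illustration under stated assumptions: NumPy is used for the vector arithmetic, the root is found by plain bisection (the source does not say which root finder was used), the minimizer is placed at the origin, and the names `nontrivial_root`, `q`, `f`, and `grad` are hypothetical.

```python
import math
import numpy as np

def nontrivial_root(a, b, tol=1e-12):
    """Positive root t of a**t = t*b + 1 (t = 0 is always a trivial root)."""
    def h(t):
        return a ** t - t * b - 1.0
    if not (a > 0.0 and math.log(a) - b < 0.0):
        raise ValueError("sketch assumes h'(0) < 0, so the root is positive")
    hi = 1.0
    while h(hi) <= 0.0:          # expand until h changes sign
        hi *= 2.0
        if hi > 1e6:
            raise ValueError("no positive root found")
    lo = hi
    while h(lo) >= 0.0:          # shrink into the region where h < 0
        lo *= 0.5
    while hi - lo > tol:         # bisection
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical data: f = q**p / p with q(x) = x^T Q x / 2 and p = 2.
Q = np.diag([1.0, 4.0])
p = 2.0

def q(x):
    return 0.5 * x @ Q @ x

def f(x):
    return q(x) ** p / p

def grad(x):
    return q(x) ** (p - 1.0) * (Q @ x)

x_prev = np.array([2.0, 1.0])
g_prev = grad(x_prev)
d = -g_prev
lam = -(d @ Q @ x_prev) / (d @ Q @ d)   # exact line search along d
x_new = x_prev + lam * d

a = f(x_new) / f(x_prev)                          # coefficient a_i of (38)
b = lam * (d @ g_prev) / (2.0 * f(x_prev))        # coefficient b_i of (38)
t = nontrivial_root(a, b)
rho = a ** (t - 1.0)                              # equation (31)
print(t, rho)
```

For a function that exactly matches the model, the computed root equals 1/p and the resulting $\rho_i$ agrees with the ratio $F'_{i-1} / F'_i$ of definition (25).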

As  indicated  by  Goldfarb  [1972]  the 
Quasi-Newton  methods  can  also  be  modified 
to  minimize  f (x)  =  F(q(x))  in  a  finite 
number  of  steps.  The  effect  of  nonline- 
arly  scaling  the  objective  function  on 
the  Quasi-Newton  methods  has  been  also 
investigated  by  Spedicato   [1976]  . 

4. Computational results

The extended Fletcher-Reeves method
(EFR) using formulas (26) and (31), where
t is calculated by solving (38), has been
tested on several standard functions and
compared with the classical
Fletcher-Reeves method (FR). An identical
one-dimensional search routine has been used
in both methods. The programs have been
written in FORTRAN and computations
performed in double precision on IBM/360.

The following problems have been tried.

Test problem 1 (Quartic with singular Hessian)

$$f(x) = (x_1 + 10 x_2)^2 + 5 (x_3 - x_4)^2 + (x_2 - 2 x_3)^4 + 10 (x_1 - x_4)^4$$

The starting point is (3,-1,0,1) and the
function has a minimum of zero at
(0,0,0,0).

Test problem 2 (homogeneous quartic)

$$f(x) = \left[ \tfrac{1}{2} x^T Q x + b^T x + 0.25 \right]^2$$

$$Q = \begin{bmatrix} 4.5 & 7 & 3.5 & 3 \\ 7 & 14 & 9 & 8 \\ 3.5 & 9 & 8.5 & 5 \\ 3 & 8 & 5 & 7 \end{bmatrix}, \qquad b = \begin{bmatrix} -0.5 \\ -1 \\ -1.5 \\ 0 \end{bmatrix}$$

The starting point is (4,4,4,4) and the
function has a minimum of zero at
(.5,-.5,.5,0).

Test problem 3 (Rosenbrock's function)

$$f(x) = 100 (x_1^2 - x_2)^2 + (1 - x_1)^2$$

The starting point is (-1.2,1) and the
function has a minimum of zero at (1,1).

Test problem 4 (a Hilbert quadratic form)

$$f(x) = (x^T H x)^k, \quad k = 2, 3$$

where

$$h_{ij} = \frac{1}{i + j - 1}, \quad i, j = 1, 2, \ldots, 5$$

The starting point is (-3,-3,...,-3). The
solution is at (0,0,...,0).

Test problem 5 (4-dimensional Rosenbrock's function)

$$f(x) = 100 (x_1^2 - x_2)^2 + (1 - x_1)^2 + 90 (x_3^2 - x_4)^2 + (1 - x_3)^2 + 10.1 [(x_2 - 1)^2 + (x_4 - 1)^2] + 19.8 (x_2 - 1)(x_4 - 1)$$

The function has a minimum of zero at
(1,1,1,1).

Table I shows the experimental
results for test Problem 1, which is a
quartic function whose Hessian is singular
(has two zero eigenvalues) at the
solution $x^T = (0,0,0,0)$. It looks as if in
such cases $f(x) = \frac{1}{p} q^p$ is a better
optimization model. This may be due to the fact
that $f(x) = \frac{1}{p} q^p$ has a singular Hessian at
the solution for p > 1.
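
The test problems can be written down directly, which also gives a quick consistency check on the stated starting points and minima. The sketch below assumes NumPy; the function names are hypothetical, and the exponent form of the Hilbert problem follows the reading $(x^T H x)^k$.

```python
import numpy as np

def powell_singular(x):
    """Test problem 1: quartic with singular Hessian."""
    x1, x2, x3, x4 = x
    return (x1 + 10 * x2) ** 2 + 5 * (x3 - x4) ** 2 + (x2 - 2 * x3) ** 4 + 10 * (x1 - x4) ** 4

Q = np.array([[4.5, 7.0, 3.5, 3.0],
              [7.0, 14.0, 9.0, 8.0],
              [3.5, 9.0, 8.5, 5.0],
              [3.0, 8.0, 5.0, 7.0]])
b = np.array([-0.5, -1.0, -1.5, 0.0])

def homogeneous_quartic(x):
    """Test problem 2: square of a quadratic."""
    x = np.asarray(x, dtype=float)
    return (0.5 * x @ Q @ x + b @ x + 0.25) ** 2

def rosenbrock(x):
    """Test problem 3."""
    x1, x2 = x
    return 100 * (x1 ** 2 - x2) ** 2 + (1 - x1) ** 2

def hilbert_form(x, k):
    """Test problem 4, assuming f = (x^T H x)**k with h_ij = 1/(i+j-1)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.array([[1.0 / (i + j - 1) for j in range(1, n + 1)] for i in range(1, n + 1)])
    return (x @ H @ x) ** k

def wood(x):
    """Test problem 5: the 4-dimensional Rosenbrock function."""
    x1, x2, x3, x4 = x
    return (100 * (x1 ** 2 - x2) ** 2 + (1 - x1) ** 2
            + 90 * (x3 ** 2 - x4) ** 2 + (1 - x3) ** 2
            + 10.1 * ((x2 - 1) ** 2 + (x4 - 1) ** 2)
            + 19.8 * (x2 - 1) * (x4 - 1))

print(powell_singular((3, -1, 0, 1)))
print(wood((-3, -1, -3, -1)), wood((-1.2, 1, 1.2, 1)))
```

The values at the starting points (3,-1,0,1), (-3,-1,-3,-1) and (-1.2,1,1.2,1) agree with the leading digits reported at iteration 0 in Tables I, III and IV respectively.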

As expected, the EFR method converges
after approximately n steps for the test
problems 2 and 4 (Table III, rows 3 to
6). It should be pointed out that in
order to achieve convergence for these
functions in approximately n steps, the
EFR method had to be implemented with an
accurate one-dimensional search. More
interesting are tests involving general
functions, such as ROSENBROCK and
4-DIMENSIONAL ROSENBROCK. In these tests we
have also implemented the EFR method with
an accurate one-dimensional search. Our
objective has been to find if EFR could
generate better search directions; that
is, converge in fewer iterations with
the optimal step size strategy.

Tables III and IV, and row 3 of Table
II show sample runs. We have had,
however, a couple of cases where EFR has
been only marginally better or slightly
outperformed by the FR method.

In most test cases the value of $\rho_i$
differed significantly from 1 in the initial
stage of optimization and approached
1 in the final phase of convergence. For
example, in the run shown in Table III $\rho_i$
varied from 138 to .572 between the second
and the twelfth iteration and was
quite close to 1 afterwards. This was not
the case with Problem 1 where even at the
very end of the convergence process $\rho_i$
assumed values as large as 15.

5. Conclusion

The preliminary results presented in
this paper suggest that the EFR method
based on the model $f(x) = \frac{1}{p} q(x)^p$, p > 0, may
deal better with functions whose Hessian
is singular at some points along the
optimization path or at the solution.

There is no strong indication that EFR
based on this model can perform significantly
better than FR when optimizing general
nonlinear continuously differentiable
functions.

Another open problem is the influence
of the one-dimensional search accuracy on
the performance of the EFR method.

And the final comment. It would be
interesting to design other methods based
on the model f(x) = F(q) where F'(q) > 0 for
q > 0 and see if they offer any advantage
over f(x) = q.


TABLE I

Iteration
Number       FR                  EFR

 0           .2150 x 10^3        .2150 x 10^3
10           .4793 x 10^(-?)     .1403 x 10^(-?)
20           .1715 x 10^(-?)     .9883 x 10^(-?)
30           .1882 x 10^(-?)     .5172 x 10^(-?)
50           .7786 x 10^(-?)

Quartic with singular Hessian


TABLE II

Test                                               Solution
Function               FR            EFR           Accuracy

Homogeneous Quartic    29*, 203**    5*, 43**      10^-11
Rosenbrock             46*, 100**    38*, 86**     10^(-?)
Hilbert k=2            20*           4*            10^-11
Hilbert k=3            25*           4*            10^(-1?)
f(x) = q^(?) ***       64*           7*            10^(-?)
f(x) = q^(?) ***       52*           6*  .69476

*Number of iterations
**Number of function and gradient evaluations
***q is a quadratic function defined in problem 2

TABLE III

Iteration
Number       FR                  EFR

 0           .19192 x 10^5       .19192 x 10^5
12           .1482 x 10^0        .7080 x 10^(-?)
20           .3491 x 10^(-?)     .44196 x 10^(-?)
25           .1009 x 10^(-?)     .1509 x 10^(-?)
30           .7474 x 10^(-?)     .9292 x 10^(-?)

4-DIMENSIONAL ROSENBROCK
x_0 = (-3,-1,-3,-1)

TABLE IV

Iteration
Number       FR                  EFR

 0           .4166 x 10^2        .4166 x 10^2
20           .8421 x 10^(?)      .7821 x 10^0
60           .1624 x 10^0        .5787 x 10^(-?)
80           .2568 x 10^(-?)     .33923 x 10^(-?)

4-DIMENSIONAL ROSENBROCK
x_0 = (-1.2,1,1.2,1)

Abstract

Algorithms are given for the efficient
solution of the class of nonlinear integer programs with
separable convex objectives and totally unimodular
constraints. Because of the special structure of
this problem class, the integrality constraints on
the variables can be easily handled. In fact, the
integrality constraints actually make the problem
"easier" than its continuous version, for in the
case that bounds are available on the problem
variables, the first of the proposed algorithms yields
the optimal solution by the solution of a single,
easily-generated linear program. For the cases in
which bounds are not available for the variables or
the sum of the variable ranges is very large, other
algorithms are discussed that yield the solution
after a finite number of linear programs and require
less storage than the first algorithm.

1.  Introduction

Consider the nonlinear integer program

$$\min_x \sum_{i=1}^n f_i(x_i) \quad \text{s.t.} \quad Ax = b, \; x \ge 0, \; x \text{ integer} \qquad (1.1)$$

where $x = (x_1, \ldots, x_n)^T$, each $f_i$ is convex on a
closed interval $B_i \subset [0, +\infty)$ and A is a totally
unimodular m x n matrix. The intervals $B_i$, which
are assumed to be given, are assumed to cover the
feasible set of (1.1) in the sense that
$\Omega = \{x \mid Ax = b,\ x \ge 0,\ x \text{ integer}\} \subset \{x \mid x_i \in B_i;\ i = 1, \ldots, n\}$.
(No differentiability or continuity properties are needed or
assumed for the $f_i$, but convexity of $f_i$ on $B_i$
implies continuity on the interior of $B_i$.)

When modified in the obvious fashion, the
algorithm to be described below can be used to solve
more general problems in which (1) the constraints
Ax = b are replaced by a system of equations and
inequalities with a totally unimodular constraint
matrix, and (2) integer upper and lower bounds on
individual variables are added. Alternatively,
such constraints can be converted into the form
(1.1) through the addition of integer slack variables,
and the resulting coefficient matrix will be totally
unimodular.

Supported in part by the National Science
Foundation under Contract No. DCR74-20584 and in part
by the United States Army under Contract No.

Problems of the class (1.1) arise in logistic
and personnel assignment applications, and have
been the subject of a number of studies [1], [4], [6],
[7], [8] that deal with the special case in which
Ax = b consists of the single constraint
$\sum_{i=1}^n x_i = K$
(see also [5] for an algorithm for this special case).
Dantzig [3, p. 498] considers the case in which the
constraints Ax = b correspond to supply-demand
constraints in a bi-partite network.

The problem class (1.1) has the remarkable
property (shown in [5]) that the integrality constraints
actually make the problem easier to solve than its
corresponding continuous version.

Specifically, it was shown in [5] that if
there exist known non-negative integers $\ell_i, u_i$ such
that $B_i = [\ell_i, u_i]$ for i = 1,...,n (i.e., there exist
known bounds for the feasible set of (1.1)), then
(1.1) may be solved by solving the single linear
program constructed by (1) replacing each $f_i$ by an
appropriate piecewise linear "approximation" and (2)
deleting the integrality constraints. It was also
shown in [5] that, if the intervals $B_i$ are infinite
(or if the sum of the ranges of the $x_i$ is very large),
an algorithm based on the solution of a finite number
of similarly constructed linear programs is
guaranteed to yield the optimal solution within a finite
number of iterations, under the assumption that the given
problem of class (1.1) has an optimal solution. In
this report we consider in more depth the question
of the trade-offs between number of LP's solved,
storage required, and number of function evaluations
for four specific algorithms of the general types
described previously. A numerical example is
discussed in Section 4.

2.    Basic Algorithms

In order to describe in a compact manner
algorithms for the problem (1.1) we will introduce
some notation to represent certain piecewise-linear
approximations to (1.1). Letting $R_i$ (i=1,...,n) be
finite sets of non-negative integers, we define a
linear programming approximation to (1.1) as the
problem $P(R_1, \ldots, R_n)$ given by:

$$
\begin{aligned}
\min_{x,\lambda} \quad & \sum_{i=1}^n \sum_{j \in R_i} f_i(j)\, \lambda_{i,j} \\
\text{s.t.} \quad & Ax = b, \quad x \ge 0 \\
& \sum_{j \in R_i} j\, \lambda_{i,j} = x_i \quad (i = 1, \ldots, n) \\
& \sum_{j \in R_i} \lambda_{i,j} = 1 \quad (i = 1, \ldots, n) \\
& \lambda_{i,j} \ge 0 \quad \forall\, i, j
\end{aligned}
\qquad (2.1)
$$

The problem $P(R_1, \ldots, R_n)$ can be thought of as a
separable programming approximation to (1.1) in
which the integrality constraints are deleted and
the breakpoint-sets are given by the sets $R_i$
(i=1,...,n). The significance of this LP
approximation, as shown in [5], is that (1) an extreme point
of (2.1) will have $x_i$ integer-valued for i = 1,...,n,
and (2) if $x_i^*$ (i=1,...,n) is part of an optimal
solution $(x^*, \lambda^*)$ of (2.1) and the breakpoint-sets $R_i$
have the property that (for i=1,...,n)

$$[\{x_i^* - 1,\ x_i^*,\ x_i^* + 1\} \cap B_i] \subset R_i, \qquad (2.2)$$

then an optimal solution to (1.1) is obtained by
setting $x_i = x_i^*$ (i=1,...,n). (It should be noted that
it is essential to employ a "separable programming
approximation" to (1.1), since other piecewise-linear
approximations that agree with $f_i$ at the
integer points in $B_i$ (i=1,...,n) may not have the
required extreme point integrality property; see
[5].)

We will now consider three algorithms for
(1.1) corresponding to three different procedures for
constructing the R_i.

Algorithm 1 (Single LP, maximal R_i)

This algorithm is applicable only if there
are known bounds l_i, u_i (which will, without loss
of generality, be assumed to be non-negative integers)
such that B_i = [l_i,u_i], i.e. if x is feasible
for (1.1), then l_i ≤ x_i ≤ u_i for i=1,...,n. It
has the advantage that it yields a solution to (1.1)
via the solution of a single easily-constructed
linear program. Specifically, let each R_i consist
of the integers in the interval [l_i,u_i], and
solve the LP (2.1), then note that the optimality
condition (2.2) is satisfied by x* if (x*,λ*) is
an optimal extreme point of the LP (if (2.1) is
infeasible, then so is (1.1)). ■

Algorithm 1 provides a very efficient, one-step
approach to the solution of (1.1) as long as the
problem P([l_1,u_1],...,[l_n,u_n]), which will have
2n + Σ_{i=1}^{n} (u_i - l_i) variables and m + 2n constraints,
does not exceed the capacity of the available linear
programming algorithm. In this regard, it should be
noted that many commercial LP "packages" have
separable programming and/or generalized upper
bound capabilities that take advantage of the special
structure of (2.1). It is also possible to modify the
data handling in the simplex algorithm so that a
distinct column is not needed for each λ_{i,j} since, for
a given i, the columns corresponding to the λ_{i,j}
can differ only in two entries. Furthermore, if the
constraints Ax = b can be given a network interpretation,
efficient algorithms for network optimization
can be applied rather than the ordinary simplex
algorithm (see, for example, [2]).

However, if known bounds l_i, u_i are not
available or if the attempted implementation of
Algorithm 1 would lead to storage problems, then
alternative approaches are possible because the optimality
conditions (2.2) do not require a "full grid" of points.
The next two algorithms take advantage of this fact.
To get starting "grids" for the next two algorithms,
two cases should be considered: (1) if bounds l_i, u_i
are known, then the initial breakpoint-sets R_i^(0)
needed for the algorithms can be taken as any integer
sets containing at least l_i and u_i, and (2) if
finite bounds are not available for all variables, let
x be an extreme point of {x | Ax=b, x ≥ 0} (such an
x may be determined by the simplex method, and it
will be integer), and set R_i^(0) to {l_i', x_i, u_i'}, where
l_i' and u_i' (i=1,...,n) are estimates (which need not
be rigorous) for lower and upper bounds on an optimal
solution of (1.1). (If a good guess is available
for the optimal solution of (1.1), the corresponding
breakpoints should also be included in the R_i^(0).)
In both cases, the initial LP considered will be
feasible if and only if (1.1) is feasible, so we
will assume in the algorithms below that feasibility
has been established.


Algorithm 2 (Multiple LP's, intermediate R_i's)

At the kth iteration (k=0,1,2,...) we assume
that a feasible solution x^(k) to (1.1) has been
obtained by solving the problem P(R_1^(k),...,R_n^(k)).
If the optimality conditions (2.2) are satisfied by
x* = x^(k) and R_i = R_i^(k), then terminate with x^(k)
as optimal solution of (1.1). Otherwise obtain new
breakpoint-sets R_i^(k+1) by adding to each R_i^(k) the
points of [{x_i^(k)-1, x_i^(k), x_i^(k)+1} ∩ B_i] \ R_i^(k) (these
are the points "missing" from the optimality conditions)
and let the (k+1)st iterate x^(k+1) be the value
of x in an optimal extreme point of
P(R_1^(k+1),...,R_n^(k+1)) (x^(k+1) will be integer).
We also assume that the algorithm tests x^(k)
for optimality in P(R_1^(k+1),...,R_n^(k+1)) and terminates
by declaring x^(k) optimal for (1.1) if this test is
satisfied. (Since (x^(k),λ^(k)) would normally be used
as a starting basic feasible solution for
P(R_1^(k+1),...,R_n^(k+1)), this is a natural assumption.) ■

Algorithm 3 (Multiple LP's, minimal R_i's)

Proceed as in Algorithm 2, except that if
x^(k) and the R_i^(k) do not satisfy the optimality
conditions of (2.2), then the breakpoint-sets R_i^(k+1)
for the next iteration are taken to be the sets
{x_i^(k)-1, x_i^(k), x_i^(k)+1} ∩ B_i (i=1,...,n). ■
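
The only difference between Algorithms 2 and 3 is the rule for forming the next breakpoint-sets, which the following sketch tries to isolate. It assumes the iterate x^(k) has already been obtained from some LP solver (not shown), and that the sets are stored as Python sets of integers; all names and data are illustrative.

```python
def neighbourhood(xi, Bi):
    """The points {x_i - 1, x_i, x_i + 1} that lie in B_i."""
    return {xi - 1, xi, xi + 1} & set(Bi)


def update_algorithm_2(x_k, R_k, B):
    """Algorithm 2: keep the old breakpoints and add the 'missing' points."""
    return [set(Ri) | neighbourhood(xi, Bi) for xi, Ri, Bi in zip(x_k, R_k, B)]


def update_algorithm_3(x_k, B):
    """Algorithm 3: discard the old breakpoints, keep only the neighbourhood."""
    return [neighbourhood(xi, Bi) for xi, Bi in zip(x_k, B)]


# Hypothetical data: B_i = {0,...,10}, initial sets {l_i, u_i} = {0, 10}.
B = [range(0, 11), range(0, 11)]
R0 = [{0, 10}, {0, 10}]
x0 = [0, 7]                      # assumed integer iterate from the first LP

R1_alg2 = update_algorithm_2(x0, R0, B)
R1_alg3 = update_algorithm_3(x0, B)
print(R1_alg2)
print(R1_alg3)
```

Under Algorithm 2 the sets only grow, which explains the storage concern raised for it, whereas under Algorithm 3 each set never holds more than three points.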

Algorithms  2  and  3,  which  represent  oppo- 
site ends  of  the  column-generation  spectrum,  will 
converge  to  an  optimal  solution  of  (1.1)  in  a  finite 
number  of  iterations,  provided  that  (1. 1)  actually 
has  an  optimal  solution.    This  result  follows  from 
a  finiteness  theorem  [5]  for  integer  programs  with 
separable  convex  objective  functions.    These  algo- 
rithms also  in  general  avoid  the  problem  generation 
and  storage  problems  that  may  occur  in  some  cases 
in  Algorithm  1.    In  certain  instances,  however,  dif- 
ficulties with  storage  limitations  and/or  speed  of 
convergence  might  also  arise  with  Algorithms  2  and 
3,  and  some  additional  computational  refinements 
are  described  in  the  following  section. 

3.    Some  Computationally  Useful  Modifications 

Although  the  algorithms  of  the  previous  sec- 
tion are  guaranteed  to  display  either  one-step  or 
finite  convergence  under  rather  weak  hypotheses, 
theoretical  finite  convergence  of  course  does  not 
necessarily  imply  that  an  optimal  solution  will  be 
obtained  within  the  time  or  storage  available  to 
solve  the  problem. 

If Algorithm 1 can be employed without
exceeding the capacity of the available LP or network
code, then time and storage will not be problems
unless Σ_{i=1}^{n} (u_i - l_i) turns out very large. On
the other hand, Algorithm 2 starts out with relatively
small breakpoint-sets, but there is no control
on the size of these sets, and thus no guarantee
that problem size limits will not be exceeded.

Although Algorithm 3 employs the minimal
breakpoint-sets needed to establish optimality,
one would not expect it to be very efficient, since
the value of a variable can change by at most one
unit at each iteration, and since much of the computed
information on values of the f_i may be discarded
at each iteration. In addition, slow convergence
might also occur in Algorithm 2 in the case
that lower and upper bounds are not known and at
least one of the estimates l_i', u_i' is consistently
violated by the iterates, since Algorithm 2 allows
only a single unit change beyond the estimated
bounds at each iteration.


Algorithm 4, to be described below (see
also Figure 1), avoids these potential problems,
and also makes use of the possibility of computing
a lower bound on the optimal value of (1.1). To
set up the linear program for determining a lower
bound, let x^(k) denote the optimal solution of the
most recent LP solved, and let R_i' be a set such
that, for each i, either x_i^(k) ∈ R_i' or x_i^(k) - 1 ∈ R_i',
and such that j ∈ R_i' implies (j+1) ∈ B_i (while any
set of breakpoints with these properties will be
suitable for lower bound generation, larger R_i' will
yield larger LP's and, in general, better lower
bounds).

A  lower  bound  on  the  optimal  value  of  (1.1) 
may  then  be  obtained  by  solving  the  following  LP, 
since  the  objective  function  of  the  LP  is  no  greater 
than  the  objective  function  of  (1.1)  on  the  feasible 
set  Q: 


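\[
\begin{aligned}
\min_{x,y,\delta}\quad & \sum_{i=1}^{n} y_i \\
\text{s.t.}\quad & Ax = b,\quad x \ge 0, \\
& y_i \ge f_i(j) + \delta_{i,j}\,\bigl(f_i(j+1) - f_i(j)\bigr), \\
& x_i = j + \delta_{i,j} \qquad (j \in R_i',\ i=1,\dots,n).
\end{aligned}
\tag{3.1}
\]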

(Of course, the lower bound generated by solving
(3.1) may turn out to be -∞, in which case it would
not be useful. In many cases, however, additional
information such as non-negativity can be included
in the lower-bounding LP in order to prevent
unboundedness. For example, if the functions f_i(x_i)
are known to be non-negative for non-negative x_i
(in many applications the f_i are exponentials or
posynomials), then the additional constraints y_i ≥ 0,
when added to the constraints of (3.1), will prevent
unboundedness of the objective function.) Since
Algorithms 2 and 3 (and Algorithm 4 below) generate
feasible solutions to (1.1), it is possible to use a
lower bound on the optimal value of (1.1) in a
termination criterion if one sets an optimality tolerance
on the gap between the lower bound and the value of
the best feasible solution. (In general, the feasible
solution with the best objective value will be the
last iterate, but an exception to this might occur if
an optimal solution (x̂, ŷ, δ̂) to (3.1) had the
property that x̂ was integer. In general, a solution of
(3.1) will not have this integrality property, but if it
does, then x̂ will be a feasible solution of (1.1),
and may have an objective function value better than
that of the last iterate, in which case this iterate
should be replaced by x̂. Note that if x̂ is integer
and Σ_{i=1}^{n} f_i(x̂_i) coincides with the optimal value of
(3.1), then x̂ solves (1.1). See Figure 1 for the details
of how these possibilities may be taken into
account.)
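
The reason (3.1) yields a lower bound is that, for a convex f_i, the line through (j, f_i(j)) and (j+1, f_i(j+1)) never lies above f_i at any integer point. The sketch below checks this numerically for one illustrative convex function; the function, the breakpoint set and all names are assumptions chosen for the demonstration, not data from the paper.

```python
def secant_value(f, j, x):
    """Value at x of the line through (j, f(j)) and (j+1, f(j+1))."""
    delta = x - j                      # plays the role of delta_{i,j} in (3.1)
    return f(j) + delta * (f(j + 1) - f(j))


def lower_envelope(f, breakpoints, x):
    """Smallest y_i permitted at x by the y-constraints of (3.1)."""
    return max(secant_value(f, j, x) for j in breakpoints)


# Illustrative convex function of the same form as the Section 4 example.
f = lambda t: 4.0 * (1 - 0.3) ** t
R_prime = [0, 3, 6]

# Gap between f and the envelope at every integer point 0..10.
gaps = [f(x) - lower_envelope(f, R_prime, x) for x in range(0, 11)]
print(min(gaps) >= -1e-12)
```

The gap is zero at each j in the breakpoint set and at j+1, and positive elsewhere, which is consistent with the remark that larger sets R_i' give better bounds.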
[Flowchart. Legible box labels: "Initialize the R_i^(0), set k = 0"; "(1.1) infeasible"; "Compute an optimal extreme point (x^(k), λ^(k)) of P^(k) and set z to the optimal value of P^(k)"; "Revise bound estimates"; "Special update for R^(k)"; "Increase k by 1"; "x^(k-1) optimal for (1.1)"; "x^(k) ε-optimal for (1.1)".]

Figure 1.    Flowchart for Algorithm 4
The statement of Algorithm 4 assumes that
the following parameters have been supplied: (a) a
bound on the total number of λ_{i,j} variables that
will be allowed in any LP, (b) increments
α_i, β_i to be used for revising lower and upper
bound estimates (these are not used if rigorous
lower and upper bounds are known), (c) an optimality
tolerance ε and a parameter establishing the
frequency of the lower bound computations (this
could be based on clock time or number of
iterations).

Algorithm 4 (Multiple LP's, bounded storage,
optimality tolerance)

Proceed as in Algorithm 3, except that:

(a) when the update of the R_i^(k) would violate
the bound on the number of λ_{i,j}, remove from the
breakpoint sets R_i^(k), prior to the update,
all indices other than those corresponding
to the current values of the upper and lower
bound estimates;

(b) when an iterate violates the current value
l_i' of the lower bound estimate, l_i' is updated
to max{0, l_i' - α_i}, and when an iterate
violates the current value u_i' of the
upper bound estimate, u_i' is updated to
u_i' + β_i;

(c) a lower bound is periodically computed by
solving a problem of the form (3.1), and the
algorithm is terminated if the objective function
value of the best feasible solution obtained
thus far lies within ε of the lower
bound (if the t-th problem of the form (3.1)
has an optimal value z_t > -∞, the constraint
Σ_{i=1}^{n} y_i ≥ z_t may be added to the
(t+1)st problem of the form (3.1) in order to
guarantee monotonicity of the lower bounds;
note that the dual simplex algorithm may be
used with the optimal solution from the t-th
problem serving as the initial solution of
the dual of the (t+1)st problem). ■
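
Steps (a) and (c) are bookkeeping rules that can be sketched without an LP solver. The code below is one possible reading, assuming the points required by (2.2) are supplied as `neighbourhoods` and that the lower bounds computed so far are kept in a list; the function names, the cap value and the numbers are all hypothetical.

```python
def update_with_cap(R, neighbourhoods, lo_est, hi_est, max_lambda):
    """Step (a): add the points needed for (2.2); if that would exceed the cap
    on lambda variables, first cut each R_i back to its two bound estimates."""
    grown = [set(Ri) | set(Ni) for Ri, Ni in zip(R, neighbourhoods)]
    if sum(len(Ri) for Ri in grown) > max_lambda:
        grown = [{lo, hi} | set(Ni)
                 for lo, hi, Ni in zip(lo_est, hi_est, neighbourhoods)]
    return grown


def within_tolerance(best_value, lower_bounds, eps):
    """Step (c): stop once the best feasible value is within eps of the
    largest lower bound found so far."""
    return best_value - max(lower_bounds) <= eps


R = [{0, 3, 4, 5, 10}, {0, 7, 8, 10}]
neighbourhoods = [{4, 5, 6}, {7, 8, 9}]      # {x_i-1, x_i, x_i+1} for x = (5, 8)

trimmed = update_with_cap(R, neighbourhoods, [0, 0], [10, 10], max_lambda=10)
kept = update_with_cap(R, neighbourhoods, [0, 0], [10, 10], max_lambda=11)
print(trimmed)
print(kept)
print(within_tolerance(7.80, [7.10, 7.75], eps=0.1))
```

Taking the maximum over the stored lower bounds has the same effect as the monotonicity constraint mentioned in (c), although it does not tighten the later LPs the way the added constraint would.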

For notational convenience in the flowchart
for Algorithm 4, Σ_{i=1}^{n} f_i(x_i) is denoted by f(x) and
P(R_1^(k),...,R_n^(k)) is denoted by P^(k).

4.    Numerical  Example 

In  this  section  we  present  a  numerical  ex- 
ample to  illustrate  the  algorithmic  ideas  introduced 
in  the  previous  sections. 

The problem dealt with has the form

\[
\begin{aligned}
\min\quad & \sum_{i=1}^{15} w_i\,(1-q_i)^{x_i} \\
\text{s.t.}\quad & \sum_{i=1}^{10} x_i = b_1,\qquad \sum_{i=6}^{15} x_i = b_2, \\
& 0 \le x \le u,\quad x \text{ integer},
\end{aligned}
\tag{4.1}
\]

where the data and optimal solution, x*, are given
in Table 1. Using the "complete grid" approach of
Algorithm 1, the resulting LP has 17 equations and
262 variables, and the use of a small, locally-written
LP package yielded the optimal solution x*
of Table 1.

By contrast, a column-generation procedure
of  the  type  described  in  Algorithm  2  started  with  hO 
variables and terminated with an optimal solution
(x* of Table 1) in 7 iterations, the final LP solved
having 131 variables.


  i     w_i     q_i     u_i    x_i*

  1     9.2     0.31    16     12
  2     1.0     0.45    16      4
  3     7.6     0.23    19     14
  4     0.6     0.09    10      2
  5     8.8     0.15    10     10
  6     4.2     0.21    11     11
  7     3.2     0.15    17     12
  8     3.4     0.01    20      0
  9     8.8     0.79    16      3
 10     6.6     0.41    15      7
 11     1.2     0.71    17      3
 12     4.6     0.77    12      4
 13     0.8     0.79    13      2
 14     3.0     0.21    20     13
 15     1.2     0.07    20     12

b_1 = 75, b_2 = 67, optimal value = 7.7586

Table 1.    Data for Numerical Example
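
As a consistency check on Table 1, the short script below evaluates the objective of (4.1) at x*, recomputes the two constraint sums, and counts the variables of the "complete grid" LP (one x_i plus u_i + 1 breakpoint variables per i, taking l_i = 0). Two assumptions are made: the constraints of (4.1) are read as equalities, and w_6, which is partly illegible in the source, is taken to be 4.2 because that value reproduces the reported optimum.

```python
# Data transcribed from Table 1 (w[5] = 4.2 is an assumed reading).
w = [9.2, 1.0, 7.6, 0.6, 8.8, 4.2, 3.2, 3.4, 8.8, 6.6, 1.2, 4.6, 0.8, 3.0, 1.2]
q = [0.31, 0.45, 0.23, 0.09, 0.15, 0.21, 0.15, 0.01, 0.79, 0.41,
     0.71, 0.77, 0.79, 0.21, 0.07]
u = [16, 16, 19, 10, 10, 11, 17, 20, 16, 15, 17, 12, 13, 20, 20]
x_star = [12, 4, 14, 2, 10, 11, 12, 0, 3, 7, 3, 4, 2, 13, 12]

# Objective of (4.1) at the tabulated solution.
objective = sum(wi * (1 - qi) ** xi for wi, qi, xi in zip(w, q, x_star))

# Left-hand sides of the two linear constraints (variables 1-10 and 6-15).
b1 = sum(x_star[:10])
b2 = sum(x_star[5:])

# Variables in the Algorithm 1 LP: n x-variables plus the integers 0..u_i for each i.
n_vars = len(u) + sum(ui + 1 for ui in u)

print(objective, b1, b2, n_vars)
```

The computed sums and variable count should agree with the reported b_1 = 75, b_2 = 67 and 262 variables, and the objective should come out close to the reported 7.7586.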
EXTREME POINT RANKING ALGORITHMS:
A  COMPUTATIONAL  SURVEY 


Patrick  G.  McKeown 
The  University  of  Georgia 


ABSTRACT 

Since  it  has  been  long  known  that  the  optimal  so- 
lution to  the  minimization  of  a  concave  objective 
function  over  a  convex  set  will  occur  at  an  ex- 
treme point  of  the  convex  set,  one  method  of  solv- 
ing this  type  of  problem  is  to  rank  these  extreme 
points.     In  the  case  where  the  objective  function 
is nonlinear and the constraint set is linear,
there have been numerous articles, more conceptual
than  computational,  on  the  application  of  an  ex- 
treme point  ranking  algorithm  as  a  solution  proce- 
dure.    In  this  paper,  we  will  review  the  useful- 
ness of  this  general  type  of  procedure  to  various 
problems  by  attempting  to  combine  the  available 
computational  literature  with  computational  re- 
sults of  the  author  that  have  not  been  previously 
presented. 

I.  Introduction 

This  paper  will  be  concerned  with  problems  of  the 
following  general  type: 

\[
\min\ f(x) \quad \text{s.t.}\quad x \in S \tag{P}
\]

where S = {x | Ax = b, x ≥ 0}, and f(x) is concave.
A is assumed to be m x n, b is m x 1, and x is n x 1.

It  was  shown  by  Hirsch  and  Hoffman   [12]   that  an 
optimal  solution  to  P  would  occur  at  an  extreme 
point  or  vertex  of  S.     Hence,  to  find  an  optimal 
solution  to  P  it  is  "only"  necessary  to  search  the 
vertices of S until an optimal solution is found
and proved. If f(x) is linear, i.e.,

\[ f(x) = \sum_{j=1}^{n} c_j x_j, \]

then  the  well-known  simplex  method  is  a  very  effi- 
cient procedure  for  carrying  out  this  search.  How- 
ever, if  f (x)   is  nonlinear,  say,  quadratic  or  in- 
teger, there  does  not  exist  a  simplex  type  of  al- 
gorithm.    It  is  the  nonlinear  case  that  we  shall 
be  concerned  with  here. 

Since no "direct" optimization techniques exist
for the case where f(x) is nonlinear, we shall
look at two approaches to searching the extreme
points of S. First, one might wish to use a linear
under approximation of f(x), say L(x), such
that L(x) ≤ f(x) for x ∈ S. In this case, to show that x*
is an optimal solution to P, we need only rank the
vertices of S until the vertex x° is found such
that L(x°) ≥ f(x*). At this point, all vertices
that could possibly be optimal have been ranked.
This is proved by Cabot and Francis [3].

The second approach applies to the case where f(x) is
a sum of a linear and a nonlinear portion, i.e.,

\[ f(x) = \sum_{j=1}^{n} c_j x_j + g(x), \qquad \text{where } g(x) \ge 0 \text{ for } x \ge 0. \]

In this situation, one may seek to find a lower
bound on g(x), say G. Then we may use the linear
portion of f(x) plus the lower bound as an under
approximation of f(x). Obviously, for x ≥ 0,
f(x) ≥ c^T x + G. Hence, if x* is optimal for P,
then we need only rank the vertices of S until a
point x° is found such that c^T x° + G ≥ f(x*). This
proves that x* is optimal. The best example of
this is the fixed charge problem.
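
The stopping rule shared by both approaches can be sketched as follows. This is only an illustration: it assumes the vertices of S are already available as a list and simply sorts them by L, whereas an actual ranking procedure generates them one at a time. The toy fixed charge data and all names are hypothetical.

```python
def rank_until_proved(vertices, L, f):
    """Visit vertices in nondecreasing order of L and stop at the first
    vertex whose L-value is at least the best f-value found so far."""
    best_x, best_f, ranked = None, float("inf"), 0
    for v in sorted(vertices, key=L):      # stand-in for a true ranking procedure
        ranked += 1
        if L(v) >= best_f:                 # L(x°) >= f(x*): nothing later can improve
            break
        if f(v) < best_f:
            best_x, best_f = v, f(v)
    return best_x, best_f, ranked


# Toy problem on S = {x >= 0, x1 + x2 + x3 = 2}; its vertices are listed directly.
c = (1, 2, 4)                               # continuous costs
fixed = (10, 1, 1)                          # fixed charges
G = min(fixed)                              # every point of S has some x_j > 0

linear = lambda x: sum(cj * xj for cj, xj in zip(c, x))
f = lambda x: linear(x) + sum(fj for fj, xj in zip(fixed, x) if xj > 0)
L = lambda x: linear(x) + G

vertices = [(2, 0, 0), (0, 2, 0), (0, 0, 2)]
result = rank_until_proved(vertices, L, f)
print(result)
```

In this example the vertex that is cheapest for the linear part carries a large fixed charge, so the search has to continue to the third-ranked vertex before optimality of the second is proved.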

In order to rank the extreme points of S, we need
to use in both cases above a result first
proved by Murty [20], given as Theorem 1 below:

Theorem 1: If E_1, E_2, ..., E_K are the first K
vertices of a linear under approximation problem
which are ranked in nondecreasing order according
to their objective value, then vertex E_{K+1} must be
adjacent to one of E_1, E_2, ..., E_K.

Simply put, this says that vertex 2 will be adjacent
to the optimal solution to the linear under
approximation and vertex 3 will be adjacent to
vertex 1 or vertex 2. This, then, gives us a procedure
for ranking the vertices if all adjacent vertices
can be found. It is this "if" that quite
possibly has accounted for the small number of
computational papers relative to the number of
conceptual works. This comes about due to the
possibility of degeneracy in S. If S is degenerate,
then there may exist multiple bases for the same
vertex. This implies that all such bases must be
available before one can be sure that all adjacent
vertices have been found. Finding all such bases
for finding and "scanning" all adjacent vertices
can be quite cumbersome. As we shall see later, a
recent application of Chernikova's work [5] has
been shown to be a way around the problem of
degeneracy. This will be discussed in more detail
when the fixed charge problem is explored.

The literature has been found to refer to basically
four types of problems:

(i)    fixed charge problems,

(ii)   traveling salesman,

(iii)  quadratic assignment, and

(iv)   concave quadratic programming problems.

We shall briefly discuss the conceptual background
of each problem and then present and discuss the
available computational applications of the extreme
point ranking procedure to each problem.
Where appropriate, we will present previously
unpublished work of the author on the problem at
hand. Finally, we will attempt to draw conclusions
about the efficiency of this type of procedure.

II.     The Fixed Charge Problem

One of the first types of problems recognized as
being of the form specified as P previously is the
linear fixed charge problem. This problem is
formulated as P_F below:

\[
\begin{aligned}
\text{Min}\quad & c^T x + f^T y \\
\text{s.t.}\quad & x \in S, \\
& y_j = \begin{cases} 1 & \text{if } x_j > 0 \\ 0 & \text{if } x_j = 0 \end{cases}
\end{aligned}
\tag{$P_F$}
\]

In this case, f and y are n x 1 and all other
dimensions are as before. The f_j values are the fixed
or "set-up" costs while the c_j values are the
continuous costs. Hirsch and Dantzig [10, 11] first
formulated P_F and recognized that an optimal
solution of P_F would occur at an extreme point of S
if the f's are non-negative.

A special subset of the fixed charge problem is
the fixed charge transportation problem (FCTP)
where the A matrix takes on the form of the Hitchcock
Transportation Problem constraint set, i.e.,

\[
\sum_{i=1}^{M} x_{ij} = d_j \quad (j = 1,\dots,N), \qquad
\sum_{j=1}^{N} x_{ij} = s_i \quad (i = 1,\dots,M), \qquad
\text{and} \quad \sum_{i=1}^{M} s_i = \sum_{j=1}^{N} d_j .
\]

Here, M = number of supply points and N = number
of demand points. Balinski [2] showed that a
linear under approximation of the objective
function of the FCTP could be found by first
setting u_ij = Min {s_i, d_j} and then setting

\[
L_B(x) = \sum_{i=1}^{M} \sum_{j=1}^{N} \left(c_{ij} + f_{ij}/u_{ij}\right) x_{ij} .
\]

Although Hirsch and Dantzig had formulated the
fixed charge problem and Balinski had found
approximate solutions earlier, Murty [20] first
suggested the use of an extreme point ranking
algorithm for solving this class of problems. He
showed that if a lower bound on the sum of fixed
charges could be found, say F_0, then the optimal
solution to P_F, (x*,y*), could be found by ranking
the vertices of the corresponding continuous problem
(all f_j = 0) until a point x° is found such
that c^T x° + F_0 ≥ c^T x* + f^T y*. The values of (x*,y*)
may be found by checking each extreme point to
determine whether a lower value of c^T x + f^T y existed.
When the point x° is found, the solution (x*,y*)
is optimal. Murty did not discuss any computational
results and left several unresolved problems
with the solution procedure.

The first unresolved problem was to find F_0. Murty
suggested  that  the  m  smallest  fixed  charges  be 
summed.     This  may  be  easily  seen  to  be  inadequate 
for  problems  with  greater-than  constraints  or  for 
degenerate  problems.     This  method  also  does  not 
attempt  to  find  a  lower  bound  that  is  feasible. 

Secondly,  Murty  did  not  discuss  an  adequate  method 
of  handling  degeneracy  in  finding  adjacent  extreme 
points  other  than  determining  all  bases  and  apply- 
ing the  simplex  method  to  each  one  in  turn. 

Although Murty did not present computational results
of using his extreme point ranking method, he
did hypothesize that it would probably be more
efficient in those cases where the continuous portion
(c^T x) dominated the fixed portion (f^T y). This
was later shown by McKeown in his dissertation [16]
and in a later article [17]. He effected an
implementation of Murty's procedure by resolving the
two problems mentioned earlier. He showed that a
lower bound on the fixed charges, F_0, could be
found by solving by linear programming the
set-covering problem P_0 below.

\[
\begin{aligned}
\text{Min}\quad & F_0 = \sum_{j=1}^{n} f_j y_j \\
\text{s.t.}\quad & \sum_{j=1}^{n} \delta_{ij}\, y_j \ge \beta_i \quad (i=1,\dots,m), \\
& y_j \ge 0,
\end{aligned}
\tag{$P_0$}
\]

where δ_ij = 1 if a_ij > 0 and δ_ij = 0 if a_ij ≤ 0,
and β_i = 1 (except for the FCTP where β_i
values greater than 1 were used
to account for the number of cells
in a row necessary to possibly
achieve feasibility).
This method was found to yield better (larger) values
of F_0 than that originally suggested by Murty
while also handling the problems with greater-than
constraints and degeneracy mentioned earlier.

To  handle  the  problem  caused  by  degeneracy  of  S  in 
finding  adjacent  vertices,  a  modification  of  work 
by  Chernikova   [5]  was  used  to  determine  adjacent 
vertices  of  a  degenerate  vertex   [25] . 

Computational experience with this implementation
of Murty's suggestion bore out the original
conjecture that ease of solution would largely be a
function of the relative size of the fixed and
continuous portions of the objective function. Two
types of problems were tested. First, a group of
general linear fixed charge problems generated by
Steinberg [26] were run using a FORTRAN code on an
IBM 360/175. These were 5x10 problems with equal-to
constraints which were randomly generated such
that 0 ≤ c_j ≤ 20 and 0 ≤ d_j ≤ 999. Five of these
problems were tested and solved regardless of
relative costs.

In  the  second  case,  a  set  of  nine  fixed  charge 
transportation  problems  originally  generated  by 
Gray   [9]  were  tested.     In  this  case,  these  problems 
ranged from M=3 and N=4 to M=6 and N=8. Three of
these  problems  had  a  relatively  large  continuous 
portion  while  the  remaining  six  were  fixed  charge 
dominant.     In  the  former  case,  the  procedure  was 
quite  efficient  regardless  of  problem  size  while 
in  the  latter  case,  the  algorithm  proved  to  be  in- 
efficient in  that  none  of  the  six  could  be  solved 
due  to  storage  overruns. 

In  the  later  article   [17] ,  McKeown  expanded  the 
computational  experience  by  using  more  variations 
of Steinberg's problems, i.e., 10x20's and problems
with greater-than constraints. In these
problems,  results  were  encouraging  as  long  as  the 
relative  cost  sizes  were  favorable.     For  the  prob- 
lems with  equal-to  constraints,  the  fixed  costs 
dominated  due  to  M  structural  variables  always 
being  basic.     However,  in  the  problems  with 
greater-than  constraints,  the  reverse  was  true  and 
the  results  showed  that  extreme  point  ranking  was 
much  more  efficient  for  this  latter  class  of  prob- 
lems.    He  also  tested  for  the  effect  of  degeneracy 
on  FCTP's  by  comparing  degenerate  versions  of 
originally  non-degenerate  problems.     The  results 
appeared  to  show  that  degeneracy  is  not  a  problem 
as  long  as  the  relative  costs  are  favorable. 

In  summary,  Murty's  extreme  point  ranking  proce- 
dure does  appear  to  be  efficient  in  those  cases 
where  the  fixed  cost  portion  is  small  compared  to 
the  continuous  portion. 

Two articles have appeared that use cutting plane
variations of Murty's procedure. Cabot [4] suggested
the use of Tui [29] cuts. The original Tui
algorithm involved determining a local minimum,
say x, over S. Then a hyperplane is passed
through the convex polytope in such a way that all
extreme points of S with value greater than the
local minimum, x, are excluded. These are combined
with the linear under approximation for fixed
charge transportation problems,

\[
L_B(x) = \sum_{j=1}^{N} \sum_{i=1}^{M} \left(c_{ij} + f_{ij}/u_{ij}\right) x_{ij} .
\]

He used two problems generated by Gray for his
testing. Both problems had M=4 and N=6. Problem
1 was continuous cost dominant while Problem 2 was
fixed cost dominant. Before using the Tui cuts,
Cabot attempted to solve both problems via extreme
point ranking using L_B(x). As would be expected,
he quite easily solved Problem 1 but was unable to
prove optimality for Problem 2 after ranking over
400 extreme points. He then used a combined Tui
cut and extreme point ranking procedure to solve
both problems with equal efficiency. This
insensitivity to relative costs became more apparent
when he devised 30 test problems by randomly
generating new objective functions for Problems 1 and
2 where the ratio of fixed costs to continuous
cost ranged from 5 to 200. For these problems
his procedure appeared to be equally efficient
regardless of the ratio of fixed to continuous
costs in that he solved 23 of the 30 test problems.
He handled degeneracy by using a perturbation
scheme, which, while resolving degeneracy,
introduced numerous additional extreme points.

In another paper, Taha [28] combined extreme point
ranking with "Glover [8] cuts" to solve general
linear fixed charge problems. He defines a Glover
Cut to be one which separates a given extreme point
from the convex polytope. He used as a linear
under approximation the continuous portion of the
objective function, i.e.,

\[ L_g(x) = \sum_{j=1}^{n} c_j x_j, \]

and used these Glover Cuts to reduce the number of
extreme points to be ranked. In addition, he used
a technique suggested by Balas [1] of dropping
constraints associated with the degenerate basic
variable to redefine the polytope. It appears that
this was done to insure that the Glover Cut was
defined rather than to find adjacent extreme points.
This procedure also tended to generate additional
extreme points to be considered. Taha solved
randomly generated problems of size as large as 15x20
with 0 ≤ c_j ≤ 800 and 0 ≤ F_j ≤ 100 in an average of
47 seconds on an IBM 360/50. As may be noted,
these problems appear to be continuous cost
dominated. This conjecture seems to be borne out by
the second set of test problems where for 0 ≤ F_j ≤
300, we find that the solution times have gone up
by a factor of four. Since these problems were
randomly generated, it is highly unlikely that any
were degenerate so we do not know the possible
effect of degeneracy on the solution procedure.

One point that Taha makes is that the use of a linear
under approximation for general problems similar
to that suggested by Balinski [2], i.e.,

\[
L_g(x) = \sum_{j=1}^{n} \left(c_j + F_j/u_j\right) x_j, \qquad \text{where } u_j \ge x_j \ \ \forall j,
\]

is made difficult by the need to solve a family of
linear programming problems to find the u_j's.

Recently, the author investigated further the use
of L_B(x) as a linear under approximation for ranking
extreme points for the fixed charge transportation
problem as suggested by Cabot [4]. In this work,
he used a ranking procedure specifically developed
for transportation polytopes [19]. The nine problems
developed by Gray [9] were used as bench marks
for comparison with other procedures. In Table 1
below, we show the results from this computational
testing. In this table we have shown, for each problem,
the size (MxN), the relative sizes of the fixed and
continuous portions at optimality (F*/C*), the number
of extreme points ranked to prove optimality,
the solution time on the UNIVAC 1110, the solution
time for Kennington's [13] branch-and-bound procedure
on the CDC Cyber 70, and Gray's original times
on the Burroughs 5500.

As we can see, only on problems 1, 3, and 8 are our
times competitive with those of Kennington or Gray.
These problems are the ones with large continuous
portions, and this is to be expected. This procedure
does seem to be better than the original ranking
algorithm suggested by Murty in that at least
we were able to solve all of the problems, while
McKeown could only solve 1, 3, and 8 using Murty's
algorithm [20].

Table 2 below compares extreme point ranking
results using the original Murty algorithm and the
Balinski Approximation. Both of these were run by
the author using the previously discussed ranking
procedure and are for the same nine problems in
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TABLE  1 


COMPUTATIONAL RESULTS USING BALINSKI'S APPROXIMATION
TO SOLVE FCTP'S

          Size                Number     1110      5500     Cyber
Problem   (MxN)   F*/C*       Ranked     Time      Time     72 Time

   1      3x4     59/270          3       0.09       7.       .08
   2      4x6     95/108        732      53.36      32.      1.35
   3      4x6     52/1947         2       0.15      26.       .08
   4      4x8     108/164       756     124.0      171.        10
   5      5x7     117/126       ***                263.8       93
   6      5x7     173/143       154      18.70     149.9     0.99
   7      5x7     1472/166      ***                 97.0       48
   8      5x7     59/2230         2        .20    3262.8       11
   9      6x8     120/194       ***               1510.1    14.52

(All times in seconds)

*** - procedure aborted due to storage overflow


TABLE 2

COMPARISON OF MURTY'S APPROACH TO
BALINSKI'S APPROXIMATION

Number of Extreme Points Ranked

Gray      Size
Problem   MxN     Murty   Balinski

   1      3x4       29        3
   2      4x6      ***      732
   3      4x6        2        2
   4      4x8      ***      756
   5      5x7      ***      ***
   6      5x7      ***      154
   7      5x7      ***      ***
   8      5x7        9        2
   9      6x8      ***      ***

*** - storage overflow

In this table we note that the Balinski Approximation is
quite a bit more efficient than the Murty approach
both in number of points ranked for those problems
that were solved, as well as for number of problems
actually solved.

III.     Concave  Quadratic  Programming  Problems 
If we define

$$(P)\qquad \min\ f(x) = c^{T}x + x^{T}Dx \quad \text{s.t. } x \in S,$$

then we may say that the optimal solution to P
will occur at an extreme point of S under any one
of the following conditions as enumerated by Cabot
and Francis [3]:

1.  The matrix D is negative definite or
negative semidefinite.

2.  Problem P is a quadratic programming
formulation of a problem occurring in
bimatrix games.

3.  Problem P is a quadratic assignment
problem.

These were not meant to be inclusive of all
conditions under which the optimal solution of P
occurred at an extreme point, but rather, were
examples of such conditions. We will consider
condition 3 in a subsequent section, and there does
not seem to have been any computational experience
in solving bimatrix games via extreme point
ranking. As a result of this, we will restrict our
attention to problems fitting condition 1, i.e.,
D negative semi-definite or negative definite.

For this case, Cabot and Francis show that if we
solve the family of linear problems, $(P_u^j)$, below for
each column of D, $d_j$, then the minimum value of
the objective function, $u_j$, can be used to
determine a linear under approximation, L(x).

$$(P_u^j)\qquad \min\ u_j = d_j^{T}x \quad \text{s.t. } x \in S$$

Now, using $u_j$, we write the linear under
approximation problem as:

$$\min\ L(x) = \sum_{j=1}^{n} (c_j + u_j)\, x_j \quad \text{s.t. } x \in S.$$
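The reason L(x) lies below the quadratic objective can be written in one line. The step below assumes, as is usual for these problems but not stated explicitly in the text, that every point of S satisfies $x_j \ge 0$; it also uses the fact that $u_j$ is the minimum of $d_j^{T}x$ over S.

```latex
x^{T} D x \;=\; \sum_{j=1}^{n} x_j \,\bigl(d_j^{T} x\bigr)
\;\ge\; \sum_{j=1}^{n} u_j\, x_j
\quad\Longrightarrow\quad
f(x) \;=\; c^{T}x + x^{T}Dx \;\ge\; \sum_{j=1}^{n} (c_j + u_j)\,x_j \;=\; L(x),
\qquad x \in S .
```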


A major drawback in using this approach is the
necessity of computing the $u_j$ values. One special
case where this is not a problem is the quadratic
transportation problem $P_{QT}$ as formulated below:

$$\min\ \sum_{i=1}^{M}\sum_{j=1}^{N} \bigl(c_{ij}x_{ij} + d_{ij}x_{ij}^{2}\bigr)$$

$$\text{s.t.}\quad \sum_{i=1}^{M} x_{ij} = B_j, \qquad j = 1,\ldots,N$$

$$\sum_{j=1}^{N} x_{ij} = A_i, \qquad i = 1,\ldots,M$$

$$x_{ij} \ge 0 \ \text{for all } i,j, \quad \text{and} \quad d_{ij} \le 0.$$

This problem was formulated as a way of modeling
the marginally decreasing cost characteristics of
the private motor freight industry [24]. As Cabot
and Francis note, $u_{ij} = d_{ij}[\min(B_j, A_i)]$ and we may
easily formulate the objective function of the linear
under approximation problem for the quadratic
transportation problem.
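A small numerical sketch of this special case follows. The supplies, demands and cost matrices are invented for illustration (they are assumptions, not data from the paper), and the function names are hypothetical. The code forms the linear coefficients $c_{ij} + d_{ij}\min(B_j, A_i)$ and evaluates both objectives at two feasible shipments.

```python
def linear_coefficients(c, d, A, B):
    """Coefficients c_ij + d_ij * min(A_i, B_j) of the linear under approximation."""
    return [[c[i][j] + d[i][j] * min(A[i], B[j]) for j in range(len(B))]
            for i in range(len(A))]


def quadratic_cost(x, c, d):
    """Quadratic transportation objective: sum of c_ij*x_ij + d_ij*x_ij**2."""
    return sum(c[i][j] * x[i][j] + d[i][j] * x[i][j] ** 2
               for i in range(len(x)) for j in range(len(x[0])))


def linear_cost(x, coeff):
    """Value of the linear under approximation at x."""
    return sum(coeff[i][j] * x[i][j]
               for i in range(len(x)) for j in range(len(x[0])))


# Hypothetical 2x3 instance (supplies A_i, demands B_j, d_ij <= 0).
A = [3, 2]
B = [2, 2, 1]
c = [[10, 12, 9], [11, 8, 14]]
d = [[-1.0, -0.5, -2.0], [-0.5, -1.0, -1.0]]
coeff = linear_coefficients(c, d, A, B)

x_extreme = [[2, 0, 1], [0, 2, 0]]    # every positive flow sits at min(A_i, B_j)
x_interior = [[1, 1, 1], [1, 1, 0]]   # feasible, but flows below their bounds

for x in (x_extreme, x_interior):
    print(linear_cost(x, coeff), quadratic_cost(x, c, d))
```

In this instance the two objectives coincide at `x_extreme` and the linear value is strictly smaller at `x_interior`, which is the behavior an under approximation should show.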


It should be noted that in the case where one is
attempting to model the decreasing marginal cost
characteristics over various routes by use of $P_{QT}$,
it would be expected that the quadratic portion
will be small relative to the continuous portion.
In fact, we would expect that $c_{ij}x_{ij} + d_{ij}x_{ij}^{2}$ would
be positive non-decreasing over the entire range
of values of $x_{ij}$, or $c_{ij} \ge |2d_{ij}\mu_{ij}|$ for each i, j,
with $\mu_{ij}$ the upper end of that range.
We would also expect that the effectiveness of
the linear under approximation for solving the problem
by extreme point ranking would increase as $c_{ij}$
increases relative to $d_{ij}\mu_{ij}$.


To test this, we used two sets of five randomly
generated quadratic transportation problems ranging
in size from 3x4 to 5x7. In each case we used
the Chernikova algorithm as modified by McKeown
and Rubin [19] to handle degenerate problems. The
continuous cost ranges are shown below:

Case I:   $2\mu_{ij}d_{ij} < c_{ij} \le 5000$

Case II:  $2\mu_{ij}d_{ij} + 500 < c_{ij} \le 9999$

In problem sets 1 through 4, the quadratic costs were
between 0 and 10. In problem set 5, these costs were
between 0 and 20. The results of this testing are
shown in Table 3.

From these results, we note some differences in
run time as we move from Case I problems to Case
II. This is felt to be due to the increasing
dominance of continuous costs over quadratic costs in
the latter case. We also note an increase in both
parameters as we move to larger problems for Case
I. This is to be expected due to increasing
problem size. This does not happen in Case II since
very few solutions need be ranked in this case.

Problem Set 5 is the same as Problem Set 2 except
that the range of quadratic costs has been
increased. Here we note a great increase in
solution parameter values in Set 5 over Set 2 for Case
I. This is due to the decrease in dominance of
continuous costs over quadratic costs.

TABLE 3

SOLUTION TIMES AND NUMBER OF EXTREME POINTS
RANKED FOR VARIOUS CONTINUOUS COST RANGES

Problem   Size     d_ij     Case I   Case II
Set       (MxN)    Range    Time     Time

   1      3x4      0-10      .09      .08
   2      4x6      0-10      .34      .31
   3      4x8      0-10      .86      .84
   4      5x7      0-10     1.78      .85
   5      4x6      0-20     1.13      .41

In terms of results, these problems tend to be
much more promising than any of the others
discussed in this paper. If further research
demonstrates that, for problems of the nature discussed
by Oi and Hurter, the costs are indeed marginally
decreasing with dominant continuous costs, then
this ranking procedure may be useful for solving
such transportation problems.

IV.     The  Quadratic  Assignment  Problem 

One area of mathematical programming where it also
has long been known that the solution would occur
at an extreme point is that of variations of the
assignment problem. One such variation, the well
known traveling salesman problem, will be discussed
in a later section. Here we will discuss the
assignment problem where there are interactions
between the assignments [7]. This is commonly
referred to as the quadratic assignment problem and
may be formulated as $P_{AQ}$ below:

$$\min\ \sum_{i=1}^{n}\sum_{j=1}^{n} C_{ij}x_{ij} + \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{p=1}^{n}\sum_{q=1}^{n} K_{ijpq}\,x_{ij}x_{pq}$$

$$\text{s.t.}\quad \sum_{i=1}^{n} x_{ik} = 1, \qquad k = 1,\ldots,n$$

$$\sum_{k=1}^{n} x_{ik} = 1, \qquad i = 1,\ldots,n$$

$$x_{ik} \ge 0,$$

where $K_{ijpq}$ is the cost of an assignment of i to j
and p to q.

That the solution of $P_{AQ}$ would occur at an extreme
point of the constraint set was first noted by
Gilmore [7]. A linear under approximation to $P_{AQ}$
was suggested by Lawler [14] and may be formulated
as follows: denote for each i, j, a minor of
$K_{ijpq}$ as $K^{ij}$ and denote the value of the solution to
the (n-1)x(n-1) assignment problem corresponding
to each minor as $Z^{ij}$. We also define a
cost $f_{ij} = Z^{ij} + C_{ij}$. The solution to the
assignment problem for the matrix $F = \{f_{ij}\}$ is a
lower bound on the solution to $P_{AQ}$. As a result,
we may use F, the assignment problem with cost
coefficients $\{f_{ij}\}$, as a linear under approximation
to $P_{AQ}$ for extreme point ranking purposes. This
procedure is analogous to that discussed earlier [3]
for finding a linear approximation to quadratic
programming problems with concave objective function.


In some problems, we may define

$$K_{ijpq} = t_{ip}\, d_{jq}.$$

This is true in the Koopmans-Beckmann single
commodity problem.

An extreme point ranking procedure for solving
this and other non-convex quadratic minimization
problems was suggested by Cabot and Francis [3]
using the linear under approximation F. This
procedure is essentially that discussed earlier for
this type of linear under approximation in that
extreme points of S are ranked until a point $x_q$ is
found such that $L(x_q) \ge f(x^*)$, where $x^*$ is the
optimal solution (the best of the points ranked so far).
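The ranking scheme with this stopping rule can be sketched for a very small Koopmans-Beckmann instance. Everything concrete below is an assumption made for illustration: the matrices `C`, `T` and `D` are invented, the function names are hypothetical, and both the (n-1)x(n-1) assignment problems and the ranking itself are done by brute-force enumeration rather than by the specialized procedures cited in the text.

```python
from itertools import permutations


def qap_cost(perm, C, T, D):
    """Objective for assigning i -> perm[i], with K_ijpq = T[i][p] * D[j][q]."""
    n = len(perm)
    linear = sum(C[i][perm[i]] for i in range(n))
    quadratic = sum(T[i][p] * D[perm[i]][perm[p]]
                    for i in range(n) for p in range(n) if p != i)
    return linear + quadratic


def lawler_matrix(C, T, D):
    """f_ij = C_ij + Z_ij, Z_ij being the optimal value of the minor's assignment problem."""
    n = len(C)
    F = [[0] * n for _ in range(n)]
    for i in range(n):
        rows = [p for p in range(n) if p != i]
        for j in range(n):
            cols = [q for q in range(n) if q != j]
            z = min(sum(T[i][p] * D[j][q] for p, q in zip(rows, sigma))
                    for sigma in permutations(cols))
            F[i][j] = C[i][j] + z
    return F


def rank_extreme_points(C, T, D):
    """Rank assignments by the linear bound; stop once the bound reaches the incumbent."""
    n = len(C)
    F = lawler_matrix(C, T, D)

    def bound(perm):
        return sum(F[i][perm[i]] for i in range(n))

    best, best_cost, examined = None, float("inf"), 0
    for perm in sorted(permutations(range(n)), key=bound):  # brute-force ranking
        if bound(perm) >= best_cost:
            break  # no later assignment can improve on the incumbent
        examined += 1
        cost = qap_cost(perm, C, T, D)
        if cost < best_cost:
            best, best_cost = perm, cost
    return best, best_cost, examined


# Hypothetical 4x4 data.
C = [[9, 2, 7, 8], [6, 4, 3, 7], [5, 8, 1, 8], [7, 6, 9, 4]]
T = [[0, 3, 0, 2], [3, 0, 0, 1], [0, 0, 0, 4], [2, 1, 4, 0]]
D = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
print(rank_extreme_points(C, T, D))
```

Scaling up the entries of `C` while holding `T` and `D` fixed makes the linear part dominate, which is the situation in which the text reports that few extreme points need to be ranked.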

Fluharty [6] applied this approach to the quadratic
assignment problem in a master's thesis.
In that work, he tested various problems
generated by previous researchers who had
worked on the quadratic assignment problem. Using
a procedure suggested by Murty [21] for determining
adjacent extreme points for assignment problems,
he tested problems for $4 \le n \le 12$ with mixed
results. Only in those cases where the continuous
values, i.e., $\{C_{ij}\}$, were large as compared to the
quadratic values was he consistently assured of
not having to enumerate all n! extreme points to
reach a solution. For a problem with n = 12,
$0 \le C_{ij} \le 99$, and $0 \le t_{ip}, d_{jq} \le 10$, the procedure
overran available core storage. However, when the
max $C_{ij}$ was increased to 999, the problem was
solved by ranking only 100 extreme points. In
Table 4, we see the results of Fluharty's work.

TABLE 4

SUMMARY OF FLUHARTY'S RESULTS

                     C_ij       t_ip, d_jq   Solutions
Problem   Size (N)   Range      Range        Ranked

   1         4       (0,0)      (0,20)          20
   2         5       (0,0)      (0,5)           11
   3         5       (1,5)      (0,5)           37
   4         6       (0,0)      (0,10)         720
   5         6       (0,9)      (0,10)         202
   6         6       (0,99)     (0,10)          10
   7         7       (0,99)     (0,9)          135
   8         7       (0,0)      (0,10)        5040
   9         7       (0,99)     (0,10)         170
  10         7       (0,9)      (0,10)        1848
  11         7       (0,0)      (0,99)        3201
  12         7       (0,99)     (0,99)        3212
  13         8       (0,0)      (0,10)         ***
  14         8       (0,9)      (0,10)         ***
  15         8       (0,99)     (0,10)         121
  16        10       (0,99)     (0,9)
  17        10       (0,999)    (0,9)          188
  18        12       (0,0)      (0,10)         ***
  19        12       (0,99)     (0,10)         ***
  20        12       (0,999)    (0,10)         100

*** - storage overflow

Once again the importance of relative size of
linear vs. nonlinear portions of the objective
function becomes evident for the use of extreme point
procedures.


V.     Traveling Salesman Problem

With all the work on using extreme point ranking
algorithms to solve various problems with linear
constraints, it might be surprising to find that
very little work has been done on the traveling
salesman problem. It can be easily seen that the
optimal traveling salesman tour can be found by
ranking the solutions to the assignment problem
until the tour is found. However, our research
has shown that only some very early work by Murty
and Karel [23] has been documented, and
they opted to move on to their branch and bound
algorithm [15] instead of continuing work on extreme
point ranking methods.
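The idea of ranking assignments until the first tour appears can be illustrated on a toy instance. The sketch below is an assumption-laden illustration, not the method of Murty and Karel: the cost matrix is invented, the function names are hypothetical, and the ranking is done by sorting all n! assignments, which is only workable for very small n.

```python
from itertools import permutations

BIG = 10 ** 6  # hypothetical large cost that discourages assigning a city to itself


def is_tour(perm):
    """True if the assignment i -> perm[i] is a single cycle through all cities."""
    city, visited = 0, 0
    while True:
        city = perm[city]
        visited += 1
        if city == 0:
            return visited == len(perm)


def tsp_by_ranking(cost):
    """Return (tour, length, rank) for the first ranked assignment that is a tour."""
    n = len(cost)

    def value(perm):
        return sum(cost[i][perm[i]] for i in range(n))

    ranked = sorted(permutations(range(n)), key=value)  # brute-force ranking
    for rank, perm in enumerate(ranked, start=1):
        if is_tour(perm):
            return perm, value(perm), rank
    return None


# Hypothetical symmetric 5-city instance.
COST = [[BIG, 3, 8, 4, 9],
        [3, BIG, 2, 7, 6],
        [8, 2, BIG, 5, 3],
        [4, 7, 5, BIG, 1],
        [9, 6, 3, 1, BIG]]
print(tsp_by_ranking(COST))
```

The rank returned is the number of assignments examined before a tour was reached; the reports cited in this section suggest that this count can become very large as the number of cities grows.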

In this early paper, a method for ranking the
assignments was presented together with a procedure
to avoid generating adjacent assignments that
contained subtours. Using this procedure, a ten city
randomly generated problem was solved "...by hand,
and the time taken was about half an hour." Also,
a 20 city symmetric problem was solved. This
"...involved the solving of 10 different assignment
problems of sizes ranging from 16 to 20. On
the Burroughs 220 computer,..., this took about 10
minutes in all." [20] As mentioned earlier, this
work, though not published, resulted in a branch
and bound algorithm. Also, the method for ranking
the assignments presented in this paper was later
published [21].

One other bit of unpublished work in this area
came to light from Sweeney and Williams [27]. It
was stated by them that for a problem on the order
of 40 cities, approximately 5000 assignments were
ranked without a tour being found.

One further comment on the traveling salesman
problem concerns another paper by Murty [22] on
the tours of the traveling salesman problem. In
that paper, he proves that all solutions that are
tours can be found by determining the vertices
adjacent to the diagonal assignment solution, i.e.,
$x_{ii} = 1$, $i = 1, \ldots, N$. However, due to the large
number of assignments adjacent to any other
assignment, unless some efficient procedure can be
found for generating these tours in a non-increasing
costwise manner, this does not appear promising.

VI.     Conclusions and Directions for Future Research

From our look at the use of extreme point ranking
procedures to solve problems with concave
objective functions and linear constraint sets, it
would seem safe to say that the efficiency of this
procedure is extremely dependent upon the
objective function. In those problems where the
objective function was approximately linear, the
procedure could be expected to be very efficient.
However, as the non-linearity of the objective
function increases, the efficiency of the procedure
decreases markedly.

Several areas of future research into the use of
this procedure present themselves. One would be
to combine the use of the Tui cut with methods for
efficiently determining adjacent vertices in the
presence of degeneracy. Another possibly fertile
area of research would be to look more closely at
the general linear fixed charge problem with
greater-than constraints. Finally, research might
be profitable in implementing Cabot and Francis'
approach to more general concave quadratic
programming problems.

ABSTRACT

During the early 1970's, highly efficient
special purpose computer codes were developed for
solving capacitated transshipment problems based
on primal extreme point algorithms. Computational
comparisons of these codes with the best
non-extreme point codes designed for the same class of
problems indicated that the primal extreme
point-based codes were substantially superior both in
terms of computer time and memory requirements on
all types of network problems. More recently,
specialized non-extreme point codes have been
developed for uncapacitated bipartite problems,
notably assignment and semi-assignment problems.
Comparisons of these problem specific non-extreme
point codes with the earlier general purpose
extreme point codes cast doubt on the earlier
computational conclusions. Consequently, the
purposes of this paper are to develop a new
extreme point algorithm which is specifically
designed to take advantage of bipartite, boolean
flow, and degeneracy aspects of assignment and
semi-assignment problems and, further, to conduct
an unbiased comparison of the alternative
algorithmic approaches for solving assignment and
semi-assignment problems.

1.  INTRODUCTION

The semi-assignment problem is a bipartite
network problem whose supply constraints are the
same as those of an assignment problem and whose
demand constraints are the same as those of a
transportation problem, or vice versa. In the
midst of the dramatic advances [1,2,5,11,14] in
network solution technology since 1969, this
important member of the network family called the
semi-assignment problem has received scant
attention. Falling midway between the classical
assignment problem and the classical transportation
problem in its generality, it was bypassed alike by those
who studied the ultra-specialized assignment
structures and those who studied the more general
bipartite transportation structures.

The neglect of the semi-assignment problem
is especially ironic in view of the fact that it
occupies one of the singularly important niches in
the network hierarchy. The "assignment half"
captures the ubiquitous multiple choice structures
of capital budgeting and planning problems and the
special ordered set constraints of mixed integer
and combinatorial programming. The "transportation
half" captures arbitrary upper and lower
bounds on disjoint sums of variables, and
therefore can provide valid relaxations for any mixed
integer program with imbedded multiple choice and
special ordered set structures. Still more
directly, the semi-assignment structure appears in
large scale scheduling and planning problems from
real world settings. For example, applications of
manpower planning (assigning personnel to jobs),
scheduling (assigning aircraft to routes, trucks
to routes, freight to transports, etc.), project
planning (assigning project components or
subassemblies to tasks over time), and a variety of
other practical problems in planning logistics
contain embedded semi-assignment problems.

Consequently, the purposes of this paper are
to develop a new extreme point algorithm (called
the alternating basis (AB) algorithm) which is
specifically designed to take advantage of
bipartite, boolean flow, and degeneracy aspects of
assignment and semi-assignment problems and,
further, to conduct a comparison of the alternative
algorithmic approaches using codes designed for
solving these problems.

Computational testing has shown that
approximately 90 percent of the pivots within special
purpose primal simplex-based algorithms [2,11,13]
are degenerate for assignment and semi-assignment
problems with more than 1000 nodes. The primal
extreme point algorithm presented in the paper for
solving semi-assignment problems, which both
circumvents and exploits degeneracy, can be viewed as
an extension of the algorithm presented in [4] and
a specialization of the algorithm presented in [8].
One of the principal features of this algorithm is
a strong form of convergence that limits the
number of degenerate steps in a far more powerful
way than achieved by "lexicographic improvement,"
as for example, in customary LP perturbation
schemes.

Each basis examined by this algorithm is
restricted to have a certain topology. We show
that if a semi-assignment problem has an optimal
solution, then an optimal solution can be found
by considering only bases of this type. The major
mathematical differences between the AB algorithm
and the simplex method are (1) the rules of the
algorithm automatically (without search) assure
that all bases have the special topological
structure, and bypass all other bases normally given
consideration by the simplex method; (2) the
algorithm is finitely convergent without reliance upon
"external" techniques (such as lexicography or
perturbation); and (3) in certain cases
non-degenerate basis exchanges may be recognized prior
to finding the representation of an incoming arc.
For these reasons, this algorithm has several
computational advantages over the highly efficient
special purpose simplex-based codes recently
developed for solving network problems.

The AB algorithm also has unique computer
implementation properties. Specifically, the data
required to represent its bases are substantially
less than that required for general simplex bases;
thus, the computer memory required to store the
basis data is less than that of special purpose
simplex-based algorithms. The computational
results in section 6 dramatically demonstrate the
power and efficiency of the AB algorithm over
other algorithms for solving assignment and
semi-assignment problems.

2.     BACKGROUND  MATERIAL 

An m x n semi-assignment problem may be
defined as:

$$\text{Minimize} \quad \sum_{(i,j) \in A} c_{ij}\,x_{ij}$$

subject to:

$$\sum_{j \in \{j:\,(i,j) \in A\}} x_{ij} = b_i, \qquad i \in I = \{1,2,\ldots,m\}$$

$$\sum_{i \in \{i:\,(i,j) \in A\}} x_{ij} = 1, \qquad j \in J = \{1,2,\ldots,n\}$$

$$x_{ij} \ge 0, \qquad (i,j) \in A$$

where I is called the set of origin nodes, J is
called the set of destination nodes, A is the set
of admissible arcs, and $c_{ij}$ is the cost of
shipping a unit from origin node i to destination
node j.

The dual of the semi-assignment problem may
be stated as:

$$\text{Maximize} \quad \sum_{i \in I} R_i\,b_i + \sum_{j \in J} K_j$$

subject to:

$$R_i + K_j \le c_{ij}, \qquad (i,j) \in A$$

where $R_i$ and $K_j$ are called the node potentials of
the origin and destination nodes, respectively.

An understanding of the results of this paper
relies on a familiarity with graphical
interpretations of the semi-assignment problem and how the
primal simplex method may be applied to this
problem. While these ideas are relatively direct,
they, unfortunately, are not succinctly itemized
in any references and will be summarized in this
section for completeness.

The semi-assignment problem may be
represented as a bipartite graph consisting of a set of
origin nodes with supplies $b_i$ and a set of
destination nodes with unit demands. Directed arcs from
origin nodes to destination nodes accommodate the
transmission of flow and incur a cost if flow
exists. The objective is to determine a set of
arc flows which satisfies the supply and demand
requirements at minimum total cost.

The bases of the simplex method for solving
an m x n semi-assignment problem correspond to
spanning trees with m + n - 1 arcs. Exactly n of
the basic arcs have an associated basic flow value
of one and the other m - 1 arcs have a basic flow
value of zero. Therefore each basic solution is
highly degenerate (i.e., contains a large number
of zero flows). This often causes the simplex
method to examine several alternative bases for
the same extreme point before moving to an
adjacent extreme point.

In the graphical representation approach, the
bases of the simplex method for semi-assignment
problems are normally kept as rooted trees
[5,7,12,14,19]. Conceptually, the root node may be
thought of as the highest node in the tree with
all of the other nodes hanging below it on
directed paths leading downward from the root. Those
nodes in the unique path from any given node i to
the root are called the ancestors of node i, and
the immediate ancestor of node i is called its
predecessor.

Figure 1 illustrates a rooted basis tree, the
predecessors of the nodes, and the basic flow
values, for a 3 x 6 semi-assignment problem.
Notationally, Oi denotes the ith origin node and
Dj denotes the jth destination node. The number
beside each link (arc) in the basis tree indicates
the flow on this arc imparted by the basic
solution. Predecessors of nodes are identified in the
PREDECESSOR array. For example, as seen from this
array, the predecessor of origin node 2 is
destination node 1. The root of the tree is node O1 and
has no predecessor.


NODE    PREDECESSOR

O1      None
O2      D1
O3      D3
D1      O1
D2      O1
D3      O1
D4      O2
D5      O3
D6      O3

Figure 1: Rooted Basis Tree

It is important to note that the directions of the links
in Figure 1 correspond to the orientation induced
by the predecessor ordering and do not necessarily
correspond to the direction of the basis arcs in
the semi-assignment problem. However, the
directions of the basic arcs are known from the
bipartite property of the semi-assignment problem;
i.e., all problem arcs lead from origin nodes to
destination nodes.

In subsequent sections the terms O-D link and
D-O link will be used to refer to links in a
rooted basis tree that are directed from an origin
node to a destination node and vice versa,
according to the orientation imparted to the basic arcs
by the predecessor indexing. For example, in
Figure 1, O2-D4 is an O-D link while D1-O2 is a
D-O link. Additionally, basic arcs with a flow of
one or zero will be referred to as 1-links and
0-links, respectively.

The fundamental pivot step of the simplex
method will now be briefly reviewed in the
graphical setting. Assume that a feasible starting
basis has been determined and is represented as a
rooted tree. To evaluate the nonbasic arcs to
determine whether any of them "price out"
profitably, and therefore are candidates to enter the
basis, it is necessary to determine values for the
dual variables $R_i$, $i \in I$, and $K_j$, $j \in J$, which satisfy
complementary slackness; i.e., which yield
$R_i + K_j = c_{ij}$ for each basic arc.

There is a unique dual variable associated
with each node in the basis tree. For this reason
the dual variables (or their values) are often
referred to as node potentials. Because of
redundancy in the defining equations of the
semi-assignment problem (and in network problems
generally), one node potential may be specified
arbitrarily. The root node is customarily
selected for this purpose and assigned a potential of
zero, whereupon the potentials of the other nodes
are immediately determined in a cascading fashion
by moving down the tree and identifying the value
for each node from its predecessor using the
equation $R_i + K_j = c_{ij}$. Highly efficient labeling
procedures for traversing the tree to initialize
and update these node potential values are
described in [5,12,14].
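The cascading computation of node potentials can be sketched directly from a predecessor array. The tree below is the one described for Figure 1; the arc costs in `COST` are invented for illustration (the source gives none), and the recursive function is a simple stand-in for the labeling procedures of the cited references, not a reproduction of them.

```python
# Predecessor array of the Figure 1 basis tree (root O1 has no predecessor).
PRED = {"O1": None, "O2": "D1", "O3": "D3",
        "D1": "O1", "D2": "O1", "D3": "O1",
        "D4": "O2", "D5": "O3", "D6": "O3"}

# Hypothetical costs c_ij for the eight basic arcs (origin, destination).
COST = {("O1", "D1"): 4, ("O1", "D2"): 6, ("O1", "D3"): 3, ("O2", "D1"): 5,
        ("O2", "D4"): 2, ("O3", "D3"): 7, ("O3", "D5"): 1, ("O3", "D6"): 8}


def node_potentials(pred, cost):
    """Set the root potential to 0 and solve R_i + K_j = c_ij down the tree."""
    potential = {}

    def resolve(node):
        if node not in potential:
            parent = pred[node]
            if parent is None:
                potential[node] = 0  # root potential chosen arbitrarily as zero
            else:
                # Problem arcs always run origin -> destination.
                arc = (node, parent) if node.startswith("O") else (parent, node)
                potential[node] = cost[arc] - resolve(parent)
        return potential[node]

    for node in pred:
        resolve(node)
    return potential


POTENTIAL = node_potentials(PRED, COST)
print(POTENTIAL)
```

With these assumed costs the origin potentials come out as O1 = 0, O2 = 1, O3 = 4, and every basic arc satisfies complementary slackness by construction.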

A feasible basic solution is optimal when all
nonbasic arcs satisfy the dual feasibility
condition $R_i + K_j \le c_{ij}$. If the solution is not
optimal, then an arc whose dual constraint is violated
(i.e., for which $R_i + K_j > c_{ij}$) is selected to
enter the basis. The arc to leave the basis is
determined by: (1) finding the unique path in
the basis tree, called the basis equivalent path,
which connects the two nodes of the entering arc,
and (2) isolating a blocking arc in this path
whose flow goes to zero ahead of (or at least as
soon as) any others as a result of increasing the
flow on the entering arc. In the basis
equivalent path, all arcs an even number of links away
from the entering arc are called even arcs, and
all arcs an odd number of links away are called
odd arcs. An increase in the flow of the incoming
arc causes a corresponding increase in the flow
of all even arcs and a corresponding decrease in
the flow of all odd arcs. Thus, if an odd arc
already has a 0 flow, then such an arc qualifies
as a blocking arc and the incoming arc cannot be
assigned a positive flow.


To illustrate, assume that the starting basis
is the one given in Figure 1 and the entering arc
is (O3,D4). The basis equivalent path for (O3,D4)
is D4-O2-D1-O1-D3-O3. (Note that this path can be
easily determined by tracing the chain of
predecessors of O3 and D4 to their point of
intersection [5,10,12].) As flow is increased on the
entering arc, the flow on the odd arc (O1,D1) must
be decreased. Since its flow is already zero,
(O1,D1) qualifies as a blocking arc so that when
arc (O3,D4) is brought into the basis, arc (O1,D1)
must be dropped. (There are no other blocking
arcs in this case.) In addition, the pivot (or
basis exchange) is degenerate since no flow change
occurs.
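The predecessor tracing used in this illustration can be sketched as follows. The predecessor array is the one given for Figure 1. The flow values in `FLOW` are an assumption: the figure's flows are not recoverable here, so they were chosen to agree with the statements that (O1,D1) carries zero flow and is the only blocking arc. Function names are hypothetical.

```python
PRED = {"O1": None, "O2": "D1", "O3": "D3",
        "D1": "O1", "D2": "O1", "D3": "O1",
        "D4": "O2", "D5": "O3", "D6": "O3"}

# Assumed basic flows: six 1-links and two 0-links.
FLOW = {("O1", "D1"): 0, ("O1", "D2"): 1, ("O1", "D3"): 0, ("O2", "D1"): 1,
        ("O2", "D4"): 1, ("O3", "D3"): 1, ("O3", "D5"): 1, ("O3", "D6"): 1}


def path_to_root(node, pred):
    """Chain of predecessors from node up to the root, inclusive."""
    path = [node]
    while pred[path[-1]] is not None:
        path.append(pred[path[-1]])
    return path


def basis_equivalent_path(origin, dest, pred):
    """Tree path joining the two end nodes of the entering arc (origin, dest)."""
    from_dest = path_to_root(dest, pred)
    from_origin = path_to_root(origin, pred)
    origin_side = set(from_origin)
    meet = next(node for node in from_dest if node in origin_side)
    head = from_dest[:from_dest.index(meet) + 1]
    tail = from_origin[:from_origin.index(meet)]
    return head + tail[::-1]


def blocking_arcs(path, flow):
    """Odd arcs (1st, 3rd, ... link from the entering arc) whose flow is already 0."""
    links = [(a, b) if a.startswith("O") else (b, a)
             for a, b in zip(path, path[1:])]
    return [arc for k, arc in enumerate(links) if k % 2 == 0 and flow[arc] == 0]


PATH = basis_equivalent_path("O3", "D4", PRED)
print(PATH)                       # ['D4', 'O2', 'D1', 'O1', 'D3', 'O3']
print(blocking_arcs(PATH, FLOW))  # [('O1', 'D1')]
```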

Once the entering and leaving arcs are known,
the basis exchange is completed simply by updating
the flow values on the basis equivalent path and
determining new node potentials for the new basis
tree.

Only a subset of the node potentials changes
during a pivot, and these can be updated rather
than determined from scratch. This fact will play
a crucial role in proving convergence of the
algorithm to be developed.

To update the node potentials, assume that
the nonbasic arc (p,q) is to enter into the basis
and the basic arc (r,s) is to leave the basis. If
arc (r,s) is deleted from the basis (before adding
arc (p,q)), two subtrees are formed, each containing
one of the two nodes of the incoming arc
(p,q). Let K denote the subtree which does not
contain the root node of the full basis. The node
potentials for the new basis may be obtained [12]
by updating only those potentials of the nodes in
K, as follows. If p is in K, subtract
$\delta = R_p + K_q - c_{pq} > 0$ from the potential of each
origin node in K and add $\delta$ to the potential of
each destination node in K. Otherwise, q is in K
and $-\delta$ is used in the above operations.
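The update rule above can be written as a short Python sketch. The function name, the dictionaries `R` and `K` for origin and destination potentials, and the small numerical instance are assumptions made for illustration only.

```python
def update_potentials(R, K, cost, p, q, k_origins, k_dests, p_in_k):
    """Update potentials in subtree K after arc (p, q) enters the basis.

    R, K      : dicts of origin and destination node potentials
    k_origins : origin nodes of the subtree K (the part cut off from the root)
    k_dests   : destination nodes of the subtree K
    p_in_k    : True if the origin p of the entering arc lies in K
    """
    delta = R[p] + K[q] - cost[p, q]    # positive for a dual infeasible arc
    if not p_in_k:                      # q is in K, so use -delta instead
        delta = -delta
    for i in k_origins:
        R[i] -= delta
    for j in k_dests:
        K[j] += delta
    return delta


# Hypothetical data: entering arc (O2, D1); K consists of O2 and D2.
R = {"O1": 0, "O2": 5}
K = {"D1": 3, "D2": 4}
cost = {("O2", "D1"): 6, ("O2", "D2"): 9}
update_potentials(R, K, cost, "O2", "D1", ["O2"], ["D2"], p_in_k=True)
print(R, K)
```

After the update the entering arc satisfies $R_p + K_q = c_{pq}$, and the sum $R_i + K_j$ on every basic arc inside K is unchanged because the two corrections cancel.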

3.     ALTERNATING  PATH  BASIS  DEFINITION  AND 
PROPERTIES 

The new alternating basis (AB) algorithm for
semi-assignment problems developed in this section
is similar to the primal simplex method as described
above. Its major mathematical distinction
is that it does not consider all feasible bases
to be candidates for progressing to an optimal
basis. That is, the simplex method allows a
feasible spanning tree of any structure whatsoever
to be included in the set of those that are
eligible for consideration as "improving bases"
along the path to optimality. However, it will be
shown that if a semi-assignment problem has an
optimal solution then it also has an optimal
solution with a unique basis tree structure,
dubbed the alternating path (AP) structure.
Furthermore, it will be shown that it is possible
to restrict attention at each step to bases with
this structure. In particular, the AB algorithm
is a procedure designed to exploit the properties
of the AP basis structure in a manner that substantially
reduces the impact of degeneracy, the
number of arithmetic operations, and the data
storage locations required to solve the
semi-assignment problem.

Definition: A rooted basis tree for a semi-assignment
problem is an alternating path (AP)
basis if:

1. The root node is an origin node.

2. All 1-links are O-D links.

3. All 0-links are D-O links.

An example of an AP basis is shown in Figure 2.

The "alternating path" designation is applied
because every path from a node to any ancestor
node in the tree, or vice versa, is an alternating
path of 1-links and 0-links. Our attention will
chiefly focus on paths from nodes to their ancestors
(as would be traced along a succession of
predecessors). A path that begins at an origin
node and ends at an ancestor destination node
will be called "0-AP" because it begins and ends
with a 0-link. Similarly, a path that begins at
a destination node and ends at an ancestor origin
node will be called "1-AP" because it begins and
ends with a 1-link.




Figure  2 — An  AP  basis  for  a  3  x  6  semi- 
assignment  problem. 

Remark 1: The 1-links of any feasible semi-assignment
solution can be augmented by 0-links
to create an AP basis (e.g., by adding 0-links
from destination nodes to origin nodes in any
fashion so that every origin node except the root
node has exactly one entering 0-link). Note that if
the arcs corresponding to the added 0-links do not
exist in the particular semi-assignment problem,
then a large (big M) cost is assigned to such links.

Remark 2: There are many semi-assignment bases
for a given feasible solution that are not AP
bases. (For example, any basis that has more than
one 0-link incident to an origin node is not an
AP basis regardless of the origin node chosen as
the root. Figure 1 is an example of such a basis.)

Remark 3: An artificially feasible AP basis may
always be constructed for an m x n semi-assignment
problem by assuming that arcs exist from each origin
node to all destination nodes, where the nonadmissible
(artificial) arcs have a "big M" cost.
The procedure is as follows.

Initially set $J' = \emptyset$, $B = \emptyset$, and $i = 1$. Go
to step 1.

1. Let r be a destination node such that
   $c_{ir} = \min_{j \in J - J'} c_{ij}$.
   Set $B = \{(i,r)\} \cup B$, $x_{ir} = 1$, $J' = J' \cup \{r\}$,
   and $b_i = b_i - 1$.

2. If $b_i \neq 0$ go to step 1. Otherwise, go to
   step 3.

3. If $i \neq m$, set $i = i + 1$ and go to step 1.
   Otherwise set $i = 2$, $J' = \emptyset$ and go to
   step 4.

4. Let r be a destination node such that
   $c_{ir} = \min_{j \in J - J'} c_{ij}$ and $(i,r) \notin B$.
   Set $J' = J' \cup \{j : (i,j) \in B\}$, $B = \{(i,r)\} \cup B$,
   and $x_{ir} = 0$.

5. If $i \neq m$, set $i = i + 1$ and go to step 4.
   Otherwise go to step 6.

6. Using B, create a spanning tree rooted
   at node 1. The resulting spanning tree
   will be an AP basis.

Proof : 

The  remark  follows  by  construction. 
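The start procedure of Remark 3 can be sketched as follows. This is a minimal Python reading of the six steps, not the authors' code: nodes are numbered from 0 (so the root origin is 0), the names `ap_start`, `pred_of_dest` and `pred_of_origin` are assumptions, every supply $b_i$ is assumed to be at least 1, and the cost matrix is a made-up 3 x 6 instance.

```python
def ap_start(cost, supply):
    """Build a starting AP basis following the steps of Remark 3.

    cost[i][j] : unit cost from origin i to destination j (big M for
                 nonadmissible arcs); supply[i] = b_i >= 1, sum(supply) == n.
    Returns (pred_of_dest, pred_of_origin) for a tree rooted at origin 0.
    """
    m, n = len(cost), len(cost[0])
    b = list(supply)
    pred_of_dest = {}             # 1-links: destination r hangs below origin i
    assigned = set()              # the set J'
    for i in range(m):            # steps 1-3
        while b[i] != 0:
            r = min((j for j in range(n) if j not in assigned),
                    key=lambda j: cost[i][j])
            pred_of_dest[r] = i
            assigned.add(r)
            b[i] -= 1
    pred_of_origin = {0: None}    # 0-links: origin i hangs below destination r
    excluded = set()              # J' is reset before step 4
    for i in range(1, m):         # steps 4-5
        own = {j for j, o in pred_of_dest.items() if o == i}
        r = min((j for j in range(n) if j not in excluded and j not in own),
                key=lambda j: cost[i][j])
        excluded |= own
        pred_of_origin[i] = r
    return pred_of_dest, pred_of_origin   # step 6: the rooted tree


# Hypothetical 3 x 6 instance.
cost = [[4, 9, 3, 8, 6, 7],
        [7, 2, 9, 5, 4, 1],
        [6, 8, 2, 3, 9, 5]]
supply = [3, 2, 1]
pred_of_dest, pred_of_origin = ap_start(cost, supply)
print(pred_of_dest)
print(pred_of_origin)
```

In the returned tree every destination hangs below the origin that serves it (a 1-link) and every non-root origin hangs below a destination (a 0-link), which is the AP structure of the definition.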

Definition: Relative to any AP basis, a nonbasic
arc is called a downward arc if it connects a
destination node to an ancestor origin node, and an
upward arc if it connects an origin node to an
ancestor destination node. An arc that connects
an origin node and a destination node that do not
have either of these ancestral relationships is
called a cross arc. (Note that these are the only
three possibilities for a nonbasic arc in a bipartite
network.)

The next two remarks point out some important
properties that can be exploited when applying
the simplex method to an AP basis.

Remark 4: When the simplex method is applied to
an AP basis, a pivot is nondegenerate if and only
if the entering nonbasic arc is a downward arc.

Proof:

The remark relies on the fact that a nondegenerate
pivot causes the flows on the basis
equivalent path to decrease and increase in
strictly alternating fashion on the odd and even
links. The "if" part of the remark then follows
by observing that a downward arc is 1-AP. The
"only if" part of the remark follows from two
observations: first, that an upward arc is 0-AP,
and second, that a cross arc has a 0-link above
the origin node incident to the entering arc (and
this arc is contained in the basis equivalent
path adjacent to one of the nodes of the entering
arc).

Remark 5: When the simplex method is applied to
an AP basis, the pivot can be carried out to give
a new AP basis for any entering nonbasic arc
simply by dropping the unique link in the basis
equivalent path attached to the origin node of
the entering arc.

Proof:

The remark follows by observing that an AP
basis results if a rooted tree is constructed
with its root node equal to the root node of the
old AP basis.
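Remarks 4 and 5 together say that both the pivot type and the leaving link can be read off the predecessor chain. The Python sketch below illustrates this on a hypothetical AP basis for a 3 x 6 problem; the tree, the node labels and the function names are assumptions and do not reproduce Figure 2.

```python
def ancestors(pred, node):
    """Proper ancestors of node, nearest first."""
    out = []
    while pred[node] is not None:
        node = pred[node]
        out.append(node)
    return out


def classify_and_leave(pred, p, q):
    """Entering nonbasic arc (p, q), p an origin and q a destination.

    Returns (kind, leaving arc), the leaving arc written as (origin, destination).
    """
    up_q = ancestors(pred, q)
    if p in up_q:
        # Downward arc: the link attached to p on the path is the 1-link
        # from p to its child on the way down to q (Remark 5).
        k = up_q.index(p)
        child = up_q[k - 1] if k > 0 else q
        return "downward", (p, child)
    # Otherwise the link attached to p is the 0-link to p's predecessor.
    kind = "upward" if q in ancestors(pred, p) else "cross"
    return kind, (p, pred[p])


# Hypothetical AP basis: O1 is the root, 1-links are O-D, 0-links are D-O.
pred = {"O1": None, "D1": "O1", "D2": "O1", "O2": "D1", "D3": "O2",
        "D4": "O2", "O3": "D4", "D5": "O3", "D6": "O3"}

print(classify_and_leave(pred, "O1", "D5"))   # ('downward', ('O1', 'D1'))
print(classify_and_leave(pred, "O3", "D1"))   # ('upward', ('O3', 'D4'))
print(classify_and_leave(pred, "O3", "D2"))   # ('cross', ('O3', 'D4'))
```

By Remark 4 only the downward case is a nondegenerate pivot; in the other two cases the subtree below the origin p is simply re-hung below q.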
Alternating Basis (AB) Algorithm

On the basis of the preceding remarks, the
rules of the AB algorithm can be stated in an
extremely simple fashion.

1. Select any feasible AP basis for the
   semi-assignment problem (e.g., using
   Remark 3).

2. Successively apply the simplex pivot
   step keeping the root node fixed and
   picking the link to leave according to
   Remark 5.

By means of these rules, the foregoing observations
imply that the AB algorithm will proceed
through a sequence of AP bases, bypassing all
other basis structures. Further, these remarks
show that a "next" AP basis is always accessible
to a given AP basis, so that the method will not
be compelled to stop prematurely without being
able to carry out a pivot before the optimality
(dual feasibility) criteria are satisfied. The
issue to be resolved, then, is whether the method
may progress through a closed circle of AP bases
without breaking out, and thus fail to converge.
It will be shown that this cannot happen, and
that, in fact, the AB algorithm is finitely
converging without any reliance upon "external"
techniques such as perturbation, as in the
ordinary simplex method. Moreover, it will be
shown that the form of convergence of the AB
algorithm has a particularly strong character,
in which origin node potentials and destination
node potentials each change in a uniform direction
throughout any sequence of degenerate pivots.

These results do not require any restrictions
on the choice of the incoming variable.
For example, it is not necessary to cull through
pivot possibilities in an attempt to find
degenerate pivot candidates. The following lemma
and theorem validate these statements.

Lemma: A basis exchange with the AB algorithm
gives rise to a new AP basis in which the new
node potentials satisfy the following properties:

a) For a nondegenerate pivot: The changed
origin node potentials strictly increase and the
changed destination node potentials strictly
decrease.

b) For a degenerate pivot: The changed
origin node potentials strictly decrease and the
changed destination node potentials strictly
increase.

Proof:

As already discussed, the node potential
values that change may be restricted to those
associated with the subtree K. By this procedure,
if subtree K contains the origin node
of the entering arc then all the origin node
potentials in K are decreased and all destination
node potentials in K are increased. The
reverse is true if the destination node of the
entering arc is in subtree K. The lemma then
follows from Remarks 4 and 5, which imply that
subtree K always contains the destination node of
the entering arc for a nondegenerate pivot and
the origin node of the entering arc for a degenerate
pivot.

Our main result may be stated as follows:

Theorem: The AB algorithm will obtain an optimal
solution (or determine that the problem is infeasible)
in a finite number of pivots, regardless of which
dual infeasible arc is chosen to be the entering
arc, and without any reliance on perturbation or
lexicographic orderings.

Proof:

It is sufficient to show that the number of
degenerate pivots that occur between any two
nondegenerate pivots must be finite. This follows
from the second half of the lemma. Note that the
node potential assigned to the root node never
changes when the node potentials in subtree K are
updated. Thus, given the constant node potential
for the root, the other node potentials are
uniquely determined for each successive basis
(regardless of the procedure by which they are
generated), and the uniform decrease of origin
node potentials and the uniform increase of destination
node potentials (for the potentials that
change) implies that no basis can ever repeat
during an uninterrupted sequence of degenerate
pivots. This completes the proof.
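The two rules of the AB algorithm can be put together in a compact Python sketch. It is an illustration under simplifying assumptions, not the SA-AB code: costs are dense (every arc exists), nodes are numbered from 0, the starting AP basis is a trivial one rather than that of Remark 3, potentials are recomputed from scratch after each pivot instead of being updated on subtree K, and the first dual infeasible arc found is taken as the entering arc. The names `ab_solve`, `pd`, `po` and `brute_force` are hypothetical.

```python
from itertools import product


def ab_solve(cost, supply):
    """Origin i serves supply[i] >= 1 destinations; each destination once."""
    m, n = len(cost), len(cost[0])
    # Rule 1: a feasible AP basis. pd[j] is the origin above destination j
    # (1-link); po[i] is the destination above origin i (0-link); root is 0.
    pd = [i for i in range(m) for _ in range(supply[i])]
    po = [None] + [0] * (m - 1)
    pivots = 0
    while True:
        R, K = {0: 0}, {}                 # potentials, root fixed at 0

        def r(i):                         # R_i + K_j = c_ij on basic arcs
            if i not in R:
                R[i] = cost[i][po[i]] - k(po[i])
            return R[i]

        def k(j):
            if j not in K:
                K[j] = cost[pd[j]][j] - r(pd[j])
            return K[j]

        entering = next(((p, q) for p in range(m) for q in range(n)
                         if r(p) + k(q) > cost[p][q]), None)
        if entering is None:
            return pd, pivots             # dual feasible, hence optimal
        p, q = entering
        pivots += 1
        climb, d = [], q                  # (origin, destination below it)
        while d is not None:
            climb.append((pd[d], d))
            d = po[pd[d]]
        above = [o for o, _ in climb]
        if p in above:                    # downward arc: nondegenerate pivot
            t = above.index(p)
            for s in range(t, 0, -1):     # reverse the links between p and q
                pd[climb[s][1]] = climb[s - 1][0]
            for s in range(t):
                po[climb[s][0]] = climb[s][1]
            pd[q] = p
        else:                             # upward or cross arc: degenerate
            po[p] = q                     # Remark 5: drop p's 0-link


def brute_force(cost, supply):
    """Exhaustive check, usable only on tiny instances."""
    m, n = len(cost), len(cost[0])
    return min(sum(cost[o][j] for j, o in enumerate(a))
               for a in product(range(m), repeat=n)
               if all(a.count(i) == supply[i] for i in range(m)))


cost = [[4, 9, 3, 8, 6], [7, 2, 9, 5, 4], [6, 8, 2, 3, 9]]
supply = [2, 2, 1]
pd, pivots = ab_solve(cost, supply)
print(sum(cost[o][j] for j, o in enumerate(pd)), brute_force(cost, supply))
```

No flow values are stored: in an AP basis the 1-links are exactly the links from an origin down to a destination, so the assignment is read directly from `pd`.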

4.     COMPUTATIONAL  CONSIDERATIONS 

Some  of  the  unique  computational  features  of 
the  AB  method  include: 

a)  It  explicitly  bypasses  all  "non-AP"  basis 
solutions  without  requiring  any  imbedded  search 
procedure  or  computational  tests. 

b) It allows degenerate pivots to be recognized
and performed without computing the representation
of the entering arc. This can be accomplished
by using the "cardinality function" of
Srinivasan and Thompson [19] which indicates the
number of nodes in the subtree of the basis tree
below a given node. In particular, following the
labeling ideas recently proposed in [5], upward
arcs and some cross arcs can be detected simply by
comparing the cardinality function values of the
nodes associated with the entering arc. That is,
denote the cardinality function by f and the
entering arc by (p,q). If f(p) < f(q) then arc
(p,q) is either an upward arc or a cross arc. In
either case the pivot is degenerate and no flow
updating is required. Remark 5, furthermore,
directly specifies the link to leave the basis.
Thus a degenerate pivot simply involves checking
the cardinality function, inserting and deleting
the appropriate links, and updating the node
potential values.

c) Similar streamlining can be achieved for
all other pivots. Specifically, if f(q) < f(p)
then the appropriate step is to find the first
node z on the path from q to the root node such
that $f(z) \geq f(p)$. If $z \neq p$ then arc (p,q) is a
cross arc and thus the pivot may be executed as
before. If z = p then the arc (p,q) is a downward
arc and the pivot is nondegenerate. Note that it
is only in the case of a nondegenerate pivot that
the entire basis equivalent path of the entering
arc is traversed. Thus it is only in this case
that the complete representation of the entering
arc is computed. This is, of course, substantially
different from standard network methods.
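The cardinality tests of items b) and c) can be sketched in Python as follows. The tree is a hypothetical AP basis, the function names are assumptions, and f is taken to count a node together with all nodes below it; two distinct nodes with equal f values cannot be ancestrally related, so the sketch treats f(p) = f(q) as degenerate as well.

```python
def cardinality(pred):
    """f[v] = number of nodes in the subtree rooted at v (v included)."""
    f = {v: 1 for v in pred}
    for v in pred:
        u = pred[v]
        while u is not None:        # credit v to each of its ancestors
            f[u] += 1
            u = pred[u]
    return f


def pivot_type(pred, f, p, q):
    """Classify the pivot for entering arc (p, q) without tracing the full path."""
    if f[p] <= f[q]:
        return "degenerate"         # upward or cross arc
    z = q
    while f[z] < f[p]:              # climb from q until the subtree is large enough
        z = pred[z]
    return "nondegenerate" if z == p else "degenerate"


# Hypothetical AP basis rooted at O1.
pred = {"O1": None, "D1": "O1", "D2": "O1", "O2": "D1", "D3": "O2",
        "D4": "O2", "O3": "D4", "D5": "O3", "D6": "O3"}
f = cardinality(pred)

print(f["O1"], f["O2"], f["O3"])          # 9 6 3
print(pivot_type(pred, f, "O1", "D5"))    # nondegenerate (downward arc)
print(pivot_type(pred, f, "O3", "D1"))    # degenerate (upward arc)
print(pivot_type(pred, f, "O3", "D2"))    # degenerate (cross arc)
```

The climb in `pivot_type` stops as soon as a sufficiently large subtree is met, so the whole basis equivalent path is walked only when the pivot turns out to be nondegenerate.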

Steps b and c above can also be accomplished
by using the "distance function" proposed by
Srinivasan and Thompson [19] which indicates the
number of links in the basis tree between a given
node and the root node. In particular, the distance
function may be used in place of the
cardinality function by reversing all the above
inequalities.

d) Flow values on the basis links never have
to be checked to determine the type of pivot. In
fact, the structural property of an AP basis,
whereby all 1-links are O-D links, makes it unnecessary
to store or update any flow values.

e) All of the above computational features
may be further enhanced by observing that it is
not necessary to store and update a "full" basis
tree. That is, while the predecessors are required
for each node, the cardinality function
values and the "thread" pointers (commonly used
to traverse basis subtrees [5,7,14] for node
potential updates) need be kept for only the
origin nodes. Using this observation, the AB
algorithm's basis data storage requirements are
roughly half that of the most efficient implementation
involving specializations of the simplex
method. Moreover, the compression may be used
to greatly reduce the number of nodes traversed
at each iteration when updating the potentials.
By maintaining node potentials for only the origins
and saving the unit costs for the arcs
represented by the destinations and their predecessors,
all data necessary for the pricing operation
is available. The destination node potential
$K_j$ may be computed as needed using the complementary
slackness relationship $K_j = c_{ij} - R_i$,
where i is the predecessor of j.
This partial update of the node potentials was
originally proposed and tested by Harris [15] for
the simplex method as applied to rectangular
transportation problems. (This approach has also
been tested more recently for transshipment networks
by Bradley, Brown, and Graves [7], who have
confirmed Harris's findings of its practical
merit.) Harris's proposal, however, differs from
the above in that the thread pointer and potential
update operation is eliminated for those destination
nodes of the basis tree which have no
descendants. Consequently, Harris's subtrees are
twice as large as in our proposal and therefore
involve twice the updating effort. In the AB
algorithm, maintenance of only the thread pointers
and node potentials associated with the origins
does not degrade the efficiency of other parts
of the algorithm. This is due to the unique
structural properties of the AP basis.
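The origin-only storage of potentials described in item e) can be illustrated with a short Python sketch. The data layout (a predecessor dictionary for destinations, a dictionary `R` of origin potentials, and a cost dictionary keyed by arc) and the numbers are assumptions made for the example.

```python
def dest_potential(j, pred, R, cost):
    """K_j recovered on demand from complementary slackness on the basic arc."""
    i = pred[j]                     # origin that precedes destination j
    return cost[i, j] - R[i]


def dual_violation(p, q, pred, R, cost):
    """Amount by which arc (p, q) violates dual feasibility (positive if it does)."""
    return R[p] + dest_potential(q, pred, R, cost) - cost[p, q]


# Hypothetical data: only origin potentials are stored.
R = {"O1": 0, "O2": -2}
pred = {"D1": "O1", "D2": "O2"}
cost = {("O1", "D1"): 4, ("O2", "D2"): 5, ("O1", "D2"): 3, ("O2", "D1"): 9}

print(dest_potential("D2", pred, R, cost))          # 7
print(dual_violation("O1", "D2", pred, R, cost))    # 4, a candidate to enter
print(dual_violation("O2", "D1", pred, R, cost))    # -7, dual feasible
```

Pricing an arc then costs one extra lookup and subtraction, in exchange for not storing or updating any destination potentials.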


5.     DEVELOPMENT  OF  THE  AB  COMPUTER  CODE 
BY  SUBROUTINE 

The computer code was written in FORTRAN IV,
is an in-core code, and was initially tested using
the RUN compiler on a CDC 6600 with a maximum
memory of 130,000 words. In this code, a semi-assignment
problem with M origins, N destinations,
and A arcs (without exploiting the word size of
the machine) requires 4M + 2N + 2A + 5000 words.
It would be possible, by exploiting the fact that
the costs, node numbers and node potentials are
integer-valued, to store more than one per word
and in this manner reduce these storage requirements.
However, our purpose was to develop a
code whose capabilities did not depend on the
unique characteristics of a particular computer
(e.g., word size, etc.). The obvious advantage of
this approach is the ease with which it enables
the code to be tested on different machines.
Further, we used a "vanilla" FORTRAN IV so that
recoding to fit differing machine conventions
would be minimized. Within these constraints, we
tried to minimize our storage requirements, at the
same time making sure the code could solve the
"thoroughly general" semi-assignment problem.
The code uses the predecessor [10], thread [14],
and distance [19] functions to maintain and update
the basis data.

A  variety  of  start  procedures  could  be  used 
to  find  a  starting  AB  basis.     However,  due  to 
severe  time  pressure,  we  only  implemented  the 
start  procedure  of  Remark  3. 

An important factor influencing computational
efficiency is the basis change criterion. The
relevant tradeoffs for the basis change criterion
involve time consumed in searching for a new arc
to enter the basis and the number of pivots required
to find an optimal solution (time per
pivot versus total number of pivots). Computational
testing [2,7,11,13,18,19] has shown that
the correct pivot criterion can reduce solution
time by as much as a factor of three. Unfortunately,
we did not have time to test alternative
pivot criteria. The code simply uses the row
most negative rule, which was found to be the best
in the studies [13,19] for small problems. This
criterion scans the arcs of each origin until it
encounters the first origin containing a dual
infeasibility and then selects the arc of this
origin which violates dual feasibility by the
largest amount to enter the basis.
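The row most negative rule described above can be sketched in Python as follows. The function name, the arc list layout (`arcs_by_origin`), and the sample potentials and costs are assumptions for illustration; the actual SA-AB code is in FORTRAN and is not reproduced here.

```python
def row_most_negative(arcs_by_origin, R, K, cost):
    """Return the entering arc chosen by the row most negative rule, or None.

    Origins are scanned in order; the first origin with a dual infeasible
    arc supplies the entering arc, namely its most violated arc.
    """
    for p, dests in arcs_by_origin.items():
        best, best_violation = None, 0
        for q in dests:
            violation = R[p] + K[q] - cost[p, q]
            if violation > best_violation:
                best, best_violation = (p, q), violation
        if best is not None:
            return best
    return None                     # no dual infeasibility: basis is optimal


# Hypothetical potentials and costs for a 2 x 3 problem.
R = {0: 0, 1: 6}
K = {0: 1, 1: 9, 2: 3}
cost = {(0, 0): 4, (0, 1): 9, (0, 2): 3,
        (1, 0): 7, (1, 1): 2, (1, 2): 5}
arcs_by_origin = {0: [0, 1, 2], 1: [0, 1, 2]}

print(row_most_negative(arcs_by_origin, R, K, cost))   # (1, 1)
```

The rule trades a possibly larger number of pivots for a short search, since scanning stops at the first origin that has any violated arc.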

The program consists of a main program and
three subroutines. The total time spent in each
subroutine was recorded by calling a Real Time
Clock (accurate to a hundredth of a second) upon
entering and leaving that subroutine. A count
was also made of the number of nondegenerate and
degenerate pivots performed. In the following
section, we discuss total solution time (exclusive
of input and output), the start time, the number
of nondegenerate and degenerate pivots, the total
pivot time, and the average pivot time (total
pivot time divided by the number of pivots). This
code will henceforth be referred to as the SA-AB code,
for semi-assignment AB algorithm code.


6.     COMPUTATIONAL  COMPARISON  AND  CODE 
REQUIREMENTS 

6.1 Computational Comparison of Several Codes

Originally we planned to compare the best
version of the SA-AB code (i.e., the code resulting
after testing alternative start and pivot
rules) with other codes which are based on other
algorithms. Unfortunately, we did not have time
to test alternative start and pivot procedures.
Thus, the following code comparison is a worst
case comparison. That is, it is comparing a
computationally unimproved version of the SA-AB
code with the best version of other codes.

The codes which we obtained for comparison
include Bennington [6], BSRL, General Motors,
SHARE, and SUPERK [3]. All of these codes are
variants of the out-of-kilter method except for
Bennington [6]. In addition, the special purpose
primal simplex codes ARC-II [5], PNET-I [11], and
SUPERT-2 [2] were available for testing. Further,
the special purpose dual simplex DNET [11] was
available for testing.

The BSRL code was developed by T. Bray and
C. Witzgall at the Boeing Scientific Research
Laboratories, and the General Motors (GM) code was
developed by Rand Corp. and is distributed by the
SHARE user group.

All of these codes are in-core codes, i.e.,
the program and all of the problem data simultaneously
reside in fast-access memory. They are
all coded in FORTRAN and none of them (including
the special purpose primal and dual simplex codes)
has been tuned (optimized) for a particular compiler.
It is important to note that all the codes
except for the SUPERT-2 and SA-AB codes are designed
to solve capacitated transshipment problems and
are not specifically designed to exploit the
special structure of semi-assignment problems.
Further, SUPERT-2 is designed to solve any uncapacitated
transportation problem. All of the problems
were solved on the CDC 6600 at the University of
Texas Computation Center using the RUN compiler.
The computer jobs were executed during periods
when the machine load was approximately the same,
and all solution times are exclusive of input and
output; i.e., the total time spent solving the
problem was recorded by calling a Real Time Clock
upon starting to solve the problem and again when
the solution was obtained.

In addition to these codes, the article by
Hatch [15] compares a primal-dual code PD-AAL
developed by Decision Systems Associates (DSA)
against PNET-I on five assignment problems generated
by NETGEN [17]. Since the DSA code is
proprietary, we could not obtain a copy of it for
comparison. However, with their published times
on a CDC-6600, we felt that some comparison could
be made if we ran the same problems on a CDC-6600.
Thus, in order to compare our results with the
results of [15], we solved the same five assignment
problems. These results are contained in
Table I. When comparing these results, it is
important to note that we are not sure that the
code PD-AAL is an all-FORTRAN code, and it is our
understanding that the code is fully optimized
to exploit the special hardware features of the
CDC-6600. This type of specialization could
easily increase the performance of all the other
codes by a factor of 2 or 3.

A noteworthy feature of the computational
results is that SA-AB, PD-AAL, SUPERT-2, and
ARC-II are, in this order, superior to the other
codes. Based on the sum of the solution times,
SA-AB is roughly fifteen percent faster than its
closest competitor and is roughly fifty percent
faster than its next closest competitor. Further,
the computational results indicate that the
AB algorithm reduces the number of pivots by 25%
over the simplex algorithm. Additionally, the
results indicate that the AB algorithm not only
reduces the number of degenerate pivots but also
reduces the number of nondegenerate pivots performed
on the denser problems. Moreover, the
reduction in the nondegenerate pivots (that is,
the number of extreme points visited) versus the
simplex algorithm increases as the number of arcs
increases.


These  results  lead  us  to  believe  that  the  AB 
algorithm  may  be  the  fastest  algorithm  for  solving 
assignment  problems. 

Further, these results indicate that this
first implementation of the AB algorithm is
1 1/2 times as fast as the fastest uncapacitated
primal simplex transportation code SUPERT-2.
Historically, it has always been possible to
improve the solution speed of the first implementation
of an algorithm by a factor of 2 or 3. Thus,
coupling this with the fact that the start and
pivot rules of SA-AB have not been computationally
investigated, it appears that the AB algorithm
is probably twice as fast as other algorithms for
solving assignment problems. This result is
extremely important since it completely contradicts
the older folklore that primal-dual and
out-of-kilter algorithms are the fastest and the
more recent folklore that special purpose primal
simplex based codes are the fastest.

Looking  at  the  out-of-kilter  and  dual  simplex 
codes'  solution  times,  it  is  interesting  to  note 
that  these  solution  times  are  much  slower  than  the 
other  codes;  further,  their  times  are  much  more 
dependent  on  the  number  of  arcs   (holding  all 
other  parameters  of  the  problem  constant)  than 
the  other  codes.     Another  important  result  which 
can  be  gleaned  from  Table  I  is  that  the  dual 
simplex  method  is  not  competitive  with  any  of  the 
other  algorithms. 

After comparing all of the codes on the
assignment problems, we chose from the available codes
the three fastest ones (note that the proprietary
PD-AAL code was not available) to compare on semi-assignment
problems. In distinction to the assignment
test problems, the specifications of the semi-assignment
test problems vary greatly in both the
number of nodes and arcs. As shown in Table II,
the eleven test problems vary in size from 50
origins and 500 destinations to 400 origins and
4000 destinations. The number of arcs varies
from 2000 arcs to 20,000 arcs. The cost range
of the test problems is 1-1000.

The solution times in Table II again indicate
that the AB algorithm is substantially superior
to the simplex algorithm. The SA-AB times strictly
dominate the times of the other codes. Comparing
the sum of the total solution times, the SA-AB
code is 2.55 times faster than one of the fastest
simplex transportation codes, SUPERT-2. This is
a rather startling result since some people have
indicated that they believe that the speed of the
simplex based codes is approaching the computational
limits of these problems.

Another noteworthy feature of the computational
results is that the SA-AB code is decidedly
superior on the largest test problems. Consider
the last two test problems, which each have 400
origins and 4000 destinations. On these problems,
the SA-AB code is a full three times faster than the
SUPERT-2 code.

Based  on  the  results  of  this  testing,  the 
fact  that  the  SA-AB  code  is  the  first  implementa- 
tion of  the  AB  algorithm,  and  the  fact  that  the 
SA-AB  code  has  not  been  computationally  refined 
(i.e.,  alternative  start  and  pivot  rules  were 
not  examined) ,  we  believe  that  the  AB  algorithm 
is  currently  the  fastest  algorithm  for  solving 
both  assignment  and  semi-assignment  problems. 
This  result  is  extremely  encouraging  since  we  have 
recently  extended  the  concepts  of  this  algorithm 
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TABLE  I 


TOTAL  SOLUTION  TIMES  ON  200  X  200  ASSIGNMENT  PROBLEMS 
(IN  SECONDS)  ON  A  CDC  6600  WITH  A  COST  RANGE  OF  1-100 


No.  of  ARCS 

1500 

2250 

3000 

3750 

4500 

1        Sum  of  Times 

ARC  II 

1 

J. 

9 
Z 

/,  7 

z 

r  -J 

3 

13 

11.67 

BENN 

X7 

44 

20 

31 

24 

92 

27 

40 

~ 

BSRL 

30 

39 

22 

08 

20 

02 

23 

XI 

21 

08 

116.68 

DNET 

19 

87 

26 

58 

27 

98 

30 

15 

31 

57 

136.15 

GM 

35 

67 

28 

43 

31 

39 

18 

62 

23 

48 

137. 59 

PD-AAL 

1 

63 

1 

14 

1 

89 

1 

29 

1 

80 

7.75 

PNET-I 

2 

31 

3 

71 

3 

47 

3 

44 

4 

79 

17 .  72 

SA-AB 

97 

1. 

12 

1. 

48 

1. 

61 

1. 

68 

6.86 

SHARE 

19 

93 

21 

17 

25 

81 

24 

95 

27 

05 

118.91 

SUPERK 

6 

44 

6 

47 

7 

25 

6 

95 

7 

56 

34.67 

SUPERT-2 

1 

1. 

57 

1 

98 

2 

17 

2 

53 

9.51 

TABLE  II 

TOTAL  SOLUTION  TIMES  ON  SEMI-ASSIGNMENT  PROBLEMS 
(IN  SECONDS)  ON  A  CDC  6600  WITH  A  COST  RANGE  OF  1-1000 


No. 

of 

Vodes  m  x  n 

No.  of  Arcs 

ARC- II 

SA- 

-AB 

SUPERT-2 

50 

X 

500 

2,000 

2.37 

1 

14 

2 

85 

50 

X 

500 

5,000 

3.53 

2 

29 

3 

51 

50 

X 

500 

10,000 

6.56 

4 

00 

5 

53 

50 

X 

1000 

4,000 

4.27 

2 

75 

6 

15 

50 

X 

1000 

10,000 

9.34 

5 

64 

11 

64 

50 

X 

1000 

20,000 

DNR 

8 

17 

17 

02 

100 

X 

1000 

4,000 

5.59 

3 

20 

6 

22 

100 

X 

1000 

10,000 

10.25 

5 

62 

10 

99 

100 

X 

1000 

16,000 

15.40 

8 

16 

14 

27 

400 

X 

4000 

10,000 

DNR 

21 

47 

61 

13 

400 

X 

4000 

16,000 

DNR 

27 

81 

91 

27 

SUM 

OF  TOTAL  TIMES 

90 

25 

230 

58 

DNR: Did not run as a result of memory limitations.

TABLE III
CODE SPECIFICATIONS

     Developer                          Name       Type                            Number of Arrays

  1. Barr                               SUPERT-2   Primal Simplex Transportation   5(M + N) + 2A
  2. Barr, Glover, Klingman             ARC-II     Primal Simplex Network          7(M + N) + 2A
  3. Barr, Glover, Klingman             SA-AB      AB Algorithm                    4M + 2N + 2A
  4. Barr, Glover, Klingman             SUPERK     Out-of-kilter                   4(M + N) + 9A
  5. Bennington                         BENN       Non-simplex                     6(M + N) + 11A
  6. Bray and Witzgall                  BSRL       Out-of-kilter                   6(M + N) + 8A
  7. Clasen                             SHARE      Out-of-kilter                   6(M + N) + 7A
  8. Decision System Associates         PD-AAL     Primal-Dual                     Not available
  9. Glover, Karney, Klingman           DNET       Dual Simplex Network            7(M + N) + 2A
 10. Glover, Karney, Klingman, Stutz    PNET-I     Primal Simplex Network          6(M + N) + 2A
 11. General Motors                     GM         Out-of-kilter                   3(M + N) + 5A

M: Origin Length
N: Destination Length
A: Arc Length



to arbitrary capacitated transportation, transshipment,
and generalized transshipment problems.

6.2 Memory Requirements of the Codes

Table III indicates the number of origin, destination,
and arc length arrays required by each of the codes
tested for solving assignment and semi-assignment
problems, except for the PD-AAL code, whose storage
requirements were not available. Excluding the array
requirements, the memory requirements of all of the
codes tested were quite close (within 1500 words).
Thus, the important factor in comparing the codes is
the number of origin, destination, and arc length
arrays.

Looking at Table III and keeping in mind that any
meaningful problem has to have more arcs than nodes,
it is clear that the AB, primal simplex, and dual
simplex codes have a distinct advantage (in terms of
memory requirements) over all of the other codes.
Further, this advantage greatly increases as the
number of arcs increases. For example, consider a
problem which has 10 times as many arcs as nodes.
ARC-II, DNET, PNET-I, SUPERT-2, or SA-AB requires only
about one-half of the memory required by the best (in
terms of memory requirements) of the other codes. This
enables the AB and simplex based codes to solve much
larger problems than the other codes. Further, it is
important to note that the AB based code, SA-AB,
requires the least amount of memory. Thus, it appears
that the AB algorithm is superior to the other
algorithms in terms of both solution speed and
computer storage requirements.
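
The "one-half" comparison can be checked directly from the array counts in Table III. The sketch below is illustrative only: the dictionary name ARRAY_WORDS and the instance (100 origins, 100 destinations, 10 arcs per node) are assumptions made for the example, and PD-AAL is left out because its storage requirements were not available.

```python
# Array-storage formulas transcribed from Table III (PD-AAL omitted).
# Each formula gives array words for M origins, N destinations, A arcs.
ARRAY_WORDS = {
    "SUPERT-2": lambda M, N, A: 5 * (M + N) + 2 * A,
    "ARC-II":   lambda M, N, A: 7 * (M + N) + 2 * A,
    "SA-AB":    lambda M, N, A: 4 * M + 2 * N + 2 * A,
    "SUPERK":   lambda M, N, A: 4 * (M + N) + 9 * A,
    "BENN":     lambda M, N, A: 6 * (M + N) + 11 * A,
    "BSRL":     lambda M, N, A: 6 * (M + N) + 8 * A,
    "SHARE":    lambda M, N, A: 6 * (M + N) + 7 * A,
    "DNET":     lambda M, N, A: 7 * (M + N) + 2 * A,
    "PNET-I":   lambda M, N, A: 6 * (M + N) + 2 * A,
    "GM":       lambda M, N, A: 3 * (M + N) + 5 * A,
}
# The AB and simplex based codes singled out in the text.
LOW_MEMORY = {"ARC-II", "DNET", "PNET-I", "SUPERT-2", "SA-AB"}

# Hypothetical instance: 10 times as many arcs as nodes.
M, N = 100, 100
A = 10 * (M + N)

words = {name: f(M, N, A) for name, f in ARRAY_WORDS.items()}
best_other = min(w for name, w in words.items() if name not in LOW_MEMORY)

for name in sorted(LOW_MEMORY, key=words.get):
    ratio = words[name] / best_other
    print(f"{name:9s} {words[name]:6d} words, {ratio:.2f} of the best other code")
```

On this instance the five codes need between roughly 0.43 and 0.51 of the array storage of GM, the most economical of the remaining codes, and SA-AB needs the least.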
RECENT DEVELOPMENTS IN VEHICLE ROUTING

Abstract

The vehicle routing problem has been regarded as a
thorny mathematical programming problem for quite some
time. However, only recently has this practical
problem received widespread attention in the
literature, and research efforts in this area
continue. Largely this stems from the emergence of
algorithmic efficiency as a major concern in
Operations Research. In addition, the new
respectability of heuristic approaches in response to
the large number of intractable NP-complete problems
has attracted many researchers. In this paper, we
survey some of the recent developments in vehicle
routing. We focus on broad issues such as, for
example, computational experiments with various
algorithms. For vehicle routing problems, the number
of feasible solutions is a key factor in algorithmic
performance. We demonstrate an approach for
calculating this number. Finally, variations dealing
with arc routing and extensions concerned with
scheduling are introduced.

Introduction 

The Vehicle Routing Problem (VRP) seems first to have
been mentioned by Garvin et al. [14] in 1957 and by
Dantzig and Ramser [11] in 1959. However, only
recently has this problem received widespread
attention in the literature, and research efforts in
this area continue. Largely this stems from the
emergence of algorithmic implementation as a major
concern in Operations Research. In addition, the new
respectability of heuristic approaches in response to
the large number of intractable NP-complete problems
such as the knapsack problem and the Traveling
Salesman Problem has attracted many researchers. In
this paper, we survey some of the historical
developments in the vehicle routing literature. The
intention is not to be encyclopedic; the surveys by
Bodin [5], Christofides [7], Golden [17], and Turner
et al. [38] discuss many details which we will not
cover here. Rather, we hope to focus on broad issues
and recent developments and provide an overview of
vehicle routing. In addition, we count feasible
solutions to the VRP and touch upon several new
directions in vehicle routing research.

Vehicle Routing Problems, sometimes referred to as
truck-dispatching problems, are almost always
encountered by complex organizations, and reliable
procedures for dealing with them are needed. Recently,
higher vehicle costs due to increased oil prices and
rising truck drivers' salaries have motivated
management to study these issues more carefully.

There may be several hundred demand points in and
around a city. The Vehicle Routing Problem is to
obtain a set of delivery routes from a central depot
to the various demand points, each of which has known
requirements, which minimizes the total distance
covered by the entire fleet. Vehicles have capacities
and maximum route time constraints. In addition, the
fleet of vehicles may be heterogeneous with respect to
these characteristics. It should be noted that there
are a number of goals involved, which include
minimizing the number of vehicles required in the
fleet, minimizing travel time or travel distance by
vehicles, and, generally, providing efficient service
to the customers as quickly and cost-effectively as
possible. The specific objective function we consider
explicitly (minimize travel distance) is both
reasonable and tractable. All vehicles depart from the
central depot, make a tour of a subset of the demand
nodes, and return to the central depot.

Examples of Vehicle Routing Problems include municipal
waste collection studied by Beltrami and Bodin [3],
fuel oil delivery studied by Garvin et al. [14],
newspaper distribution studied by Golden, Magnanti,
and Nguyen [19], and routing of school buses studied
by Newton and Thomas [26]. It should be clear that all
of these vehicle routing problems are more or less the
same. Operationally the examples may seem different,
but theoretically they can be thought of as
equivalent.

Proposed techniques for solving problems of this sort
have fallen into two classes: those which solve the
problem optimally by branch and bound techniques
(Christofides and Eilon [8], Eilon et al. [12], and
Pierce [30]), and those which solve the problem
heuristically (such as Christofides and Eilon [9],
Clarke and Wright [10], Dantzig and Ramser [11],
Gaskell [15], Gillett and Miller [16], Golden,
Magnanti, and Nguyen [19], Holmes and Parker [20],
Russell [34], Tillman and Cochran [37], Tyagi [39],
and Yellow [41]). In a loose sense, heuristic
algorithms represent sets of rules which produce good
solutions to given combinatorial programming problems,
but not necessarily the best possible (optimal)
solutions. Since algorithms which are guaranteed to
achieve optimality are viable only for very small
problems, most authors prefer to concentrate on the
study of heuristic algorithms.

VRP  Formulation 

Garvin et al. [14], who treated an application in the
oil industry, introduced a mixed integer programming
heterogeneous fleet problem formulation of the VRP.
This formulation has on the order of n^[exponent
illegible] constraints, where n is the number of
nodes.

Alternatively, when the fleet is homogeneous, the VRP
may be expressed as the following large-scale
linear-integer program.

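\[
\text{Minimize} \quad \sum_{j=1}^{m} c_j x_j
\]

\[
\text{subject to} \quad \sum_{j=1}^{m} \delta_{ij} x_j = 1, \qquad i = 1, \ldots, n,
\]

where

\[
\delta_{ij} =
\begin{cases}
1 & \text{if the $i$th customer is on tour $j$} \\
0 & \text{otherwise,}
\end{cases}
\qquad
x_j =
\begin{cases}
1 & \text{if tour $j$ is chosen} \\
0 & \text{otherwise,}
\end{cases}
\]

$c_j$ = the length of tour j
n = number of nodes
m = total number of feasible tours.
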

This set partitioning formulation was first given by
Balinski and Quandt [2] in 1964. The objective
function minimizes total distance traveled. Clearly
such a formulation is most valuable for very small
Vehicle Routing Problems due to the large number of
variables (feasible tours). This formulation may be
useful as a conceptual tool, whereas the formulation
given by Garvin et al. is more explicit and might be
better suited for integer programming techniques. In
the worst case


\[
m = \sum_{k=1}^{n} \binom{n}{k} = \sum_{k=1}^{n} \frac{n!}{k!\,(n-k)!} = 2^{n} - 1.
\]


For example with n = 25, there may be more than 33
million feasible tours. Balinski and Quandt report
success in solving VRP's using a cutting plane
algorithm with problems for which n < 15 and m < 300.
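
The 33 million figure can be reproduced with a short calculation. This is a sketch under one stated assumption: that the worst case treats every nonempty subset of the n nodes as a candidate tour, which is consistent with the number quoted for n = 25. The function name max_feasible_tours is hypothetical.

```python
from math import comb

def max_feasible_tours(n):
    """Worst-case number of candidate tours: one per nonempty subset of n nodes."""
    return sum(comb(n, k) for k in range(1, n + 1))

# The sum of binomial coefficients collapses to 2**n - 1.
print(max_feasible_tours(25))   # 33554431
```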

The set partitioning problem belongs to the
NP-complete problem class along with the VRP. A
problem is NP-complete if it can be shown to be
equivalent to the Traveling Salesman Problem (TSP) in
the following sense. If one can be solved optimally by
a polynomial algorithm, then both can be solved by
polynomial algorithms. A polynomial algorithm is one
in which running time is bounded by a polynomial
function of the size of the input. It seems likely,
however, that all NP-complete problems are intractable
and any optimal algorithm must be of at least
exponential time complexity.

We present below a complete linear-integer programming
VRP formulation with structure more closely related to
the fundamental TSP formulation. This formulation was
introduced by Golden, Magnanti, and Nguyen [19].

We point out that in the otherwise excellent survey by
Turner et al. [38], an incorrect integer programming
formulation is given. Unfortunately, their formulation
fails to prevent the formation of subtours. We will
refer to the following formulation as the VRP
formulation:


Minimize

[The objective (1), constraints (2) through (8), and
the left-hand side of (9) are missing from the source.]

(for $2 \le i \ne j \le n$, for some real numbers $y_i$)    (9)

$x_{ij}^{k} = 0$ or $1$ (for all i, j, k),    (10)

where

n = number of nodes
NV = number of vehicles
$P_k$ = capacity of vehicle k
$T_k$ = maximum time allowed for route of vehicle k
$Q_i$ = demand at node i ($Q_1 = 0$)
$t_i^k$ = time required for vehicle k to deliver or
    collect at node i ($t_1^k = 0$)
$t_{ij}^k$ = travel time for vehicle k from node i to
    node j ($t_{ii}^k = \infty$)
$d_{ij}$ = distance from node i to node j
$x_{ij}^k$ = 1 if arc (i,j) is traversed by vehicle k,
    0 otherwise.

Node 1 represents the central depot. Equation (1)
states that total distance is to be minimized.
Equations (2) and (3) ensure that each demand node is
served by one vehicle and only one vehicle. Route
continuity is represented by equations (4), i.e., if a
vehicle enters a demand node, it must exit from that
node. Equations (5) are the vehicle capacity
constraints; similarly, equations (6) are the total
elapsed route time constraints. For instance, a
newspaper delivery truck may be restricted from
spending more than one hour on a tour in order that
the maximum time interval from press to street be made
as short as possible. Equations (7) and (8) make
certain that vehicle availability is not exceeded.
Equations (9) are the subtour-breaking constraints.
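
The constraints described above can be checked for a given set of routes without writing out the integer program. The following is a minimal sketch, not the formulation itself: the function name route_report, the five-node instance, and the use of Euclidean distance as travel time are all assumptions made for illustration.

```python
import math

DEPOT = 1  # the text numbers the central depot as node 1

def route_report(routes, coords, Q, P, T, service):
    """Check routes against the service, capacity, time, and fleet-size rules.

    routes[k] lists the demand nodes visited by vehicle k in order; leaving
    from and returning to the depot is implied. Travel time is taken equal
    to Euclidean distance (an illustrative assumption).
    Returns (total_distance, list_of_violations).
    """
    if len(routes) > len(P):
        return math.inf, ["more routes than available vehicles"]
    problems = []
    visits = [i for r in routes for i in r]
    for i in coords:
        if i != DEPOT and visits.count(i) != 1:
            problems.append(f"node {i} is served {visits.count(i)} times")
    total = 0.0
    for k, r in enumerate(routes):
        path = [DEPOT] + list(r) + [DEPOT]
        length = sum(math.dist(coords[i], coords[j]) for i, j in zip(path, path[1:]))
        load = sum(Q[i] for i in r)
        elapsed = length + sum(service[i] for i in r)
        if load > P[k]:
            problems.append(f"vehicle {k}: load {load} exceeds capacity {P[k]}")
        if elapsed > T[k]:
            problems.append(f"vehicle {k}: time {elapsed:.1f} exceeds limit {T[k]}")
        total += length
    return total, problems

# Hypothetical instance: depot at the origin and four demand nodes.
coords = {1: (0, 0), 2: (0, 3), 3: (4, 3), 4: (4, 0), 5: (-3, 0)}
Q = {1: 0, 2: 2, 3: 2, 4: 2, 5: 3}          # demands, with Q[1] = 0 at the depot
P = [6, 6]                                   # capacity of each vehicle
T = [20, 20]                                 # maximum route time of each vehicle
service = {i: (0 if i == DEPOT else 1) for i in coords}

print(route_report([[2, 3, 4], [5]], coords, Q, P, T, service))
print(route_report([[2, 3, 4, 5]], coords, Q, P, T, service))
```

The first call describes a feasible pair of routes; the second puts every node on one vehicle and reports both a capacity and a route time violation.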

Since (2) and (4) imply (3), and (4) and (7) imply
(8), the equations (3) and (8) are redundant and can
be excluded from the model; in any case, the
linear-integer program is awesome in size. Typical
problems involve hundreds of demand points.

Historical  Survey 

Dantzig and Ramser [11] were the first researchers to
obtain a method for solving the VRP approximately. In
1964 Clarke and Wright [10] extended the Dantzig and
Ramser model to consider routing for a fleet of
vehicles of varying capacities. The new procedure
borrowed the concept of node aggregation from the
earlier method and it seemed to yield better results.
Undoubtedly, the Clarke-Wright "savings" method is the
most widely used and cited vehicle routing algorithm.
It involves first evaluating all potential savings
$s_{ij} = d_{1i} + d_{1j} - d_{ij}$ from linking two
nodes i and j, and then joining those nodes with the
highest feasible savings at each iteration. Initially,
each node is served individually from the central
depot. This heuristic has been analyzed and modified
extensively; Gaskell [15] in 1967 experimented with
another savings function
$\pi_{ij} = d_{1i} + d_{1j} - 2d_{ij}$ with limited
success. In the above savings functions, $d_{ij}$
denotes the distance between nodes i and j.
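
A compact version of the savings idea can be sketched as follows. It assumes a single common vehicle capacity, symmetric distances, and the simultaneous (parallel) variant in which any two route ends may be joined; the function name clarke_wright and the small instance are hypothetical and are not taken from the papers cited.

```python
import math

def clarke_wright(d, demand, capacity, depot=1):
    """Join route ends in order of decreasing savings s_ij = d_1i + d_1j - d_ij."""
    nodes = [i for i in demand if i != depot]
    route_of = {i: [i] for i in nodes}       # initially one route per node
    savings = sorted(
        ((d[depot][i] + d[depot][j] - d[i][j], i, j)
         for i in nodes for j in nodes if i < j),
        reverse=True,
    )
    for s, i, j in savings:
        ri, rj = route_of[i], route_of[j]
        if s <= 0 or ri is rj:
            continue                          # no gain, or already on one route
        if sum(demand[v] for v in ri) + sum(demand[v] for v in rj) > capacity:
            continue                          # merged route would be overloaded
        if i not in (ri[0], ri[-1]) or j not in (rj[0], rj[-1]):
            continue                          # only route ends can be linked
        if ri[-1] != i:
            ri.reverse()                      # orient so that ri ends at i
        if rj[0] != j:
            rj.reverse()                      # orient so that rj starts at j
        merged = ri + rj
        for v in merged:
            route_of[v] = merged
    routes, seen = [], set()
    for r in route_of.values():
        if id(r) not in seen:
            seen.add(id(r))
            routes.append(r)
    return routes

# Hypothetical instance: depot is node 1, distances are Euclidean.
coords = {1: (0, 0), 2: (0, 3), 3: (4, 3), 4: (4, 0), 5: (-3, 0)}
d = {i: {j: math.dist(coords[i], coords[j]) for j in coords} for i in coords}
demand = {1: 0, 2: 2, 3: 2, 4: 2, 5: 3}

routes = clarke_wright(d, demand, capacity=6)
print(routes)   # [[2, 3, 4], [5]]
```

Here the largest saving (nodes 3 and 4) is accepted first, node 2 is then attached to that route, and node 5 stays on its own route because adding it would exceed the capacity.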

Tyagi [39] in 1968 presented a method which groups
demand points in the following very straightforward
fashion. Starting with node 2 (node 1 is the central
depot) we find its nearest neighbor, say node k,
subject to the vehicle capacity restrictions (we
assume here that all vehicle capacities are the same).
We next find the nearest neighbor to node k, say node
j, subject to the capacity restrictions and continue
until adding a nearest neighbor would result in a tour
exceeding the maximum vehicle capacity. Rules of thumb
are specified to minimize the frequency with which a
group will consist of only one delivery point,
especially in the case where the delivery is small or
the distance from the central depot to this point is
more than half the distance from the farthest point to
the central depot. Having grouped the delivery points
into m tours, the vehicle dispatching problem reduces
to m TSP's, one for each tour. Computational aspects
of this algorithm are discussed in Golden, Magnanti,
and Nguyen [19].
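
The grouping step can be sketched in a few lines. This is a simplified reading of the description above, with two assumptions stated plainly: the rules of thumb for single-point groups are omitted, and each new group starts from the lowest-numbered unassigned node. The function name nearest_neighbor_groups and the instance are hypothetical.

```python
import math

def nearest_neighbor_groups(d, demand, capacity, depot=1):
    """Group demand points by repeatedly adding the nearest neighbor that fits."""
    unassigned = sorted(i for i in demand if i != depot)
    groups = []
    while unassigned:
        current = unassigned.pop(0)           # seed the next group
        group, load = [current], demand[current]
        while True:
            fits = [j for j in unassigned if load + demand[j] <= capacity]
            if not fits:
                break                         # no remaining point fits the vehicle
            nxt = min(fits, key=lambda j: d[current][j])
            unassigned.remove(nxt)
            group.append(nxt)
            load += demand[nxt]
            current = nxt
        groups.append(group)
    return groups

# Hypothetical instance: depot is node 1, distances are Euclidean.
coords = {1: (0, 0), 2: (0, 3), 3: (4, 3), 4: (4, 0), 5: (-3, 0)}
d = {i: {j: math.dist(coords[i], coords[j]) for j in coords} for i in coords}
demand = {1: 0, 2: 2, 3: 2, 4: 2, 5: 3}

groups = nearest_neighbor_groups(d, demand, capacity=6)
print(groups)   # [[2, 3, 4], [5]]
```

Each resulting group would then be sequenced as a separate TSP, as the text notes.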

In 1969, Christofides and Eilon [8] and Pierce [30]
presented algorithms for finding optimal solutions to
small VRP's. Christofides and Eilon present a branch
and bound strategy similar to the well-known Little et
al. [24] approach for TSP's. Pierce discusses
direct-search, combinatorial programming algorithms
for a host of truck-dispatching problems. Delivery
deadlines, earliest delivery time constraints, carrier
capacity constraints, optional deliveries, and
generalized objective functions are developed in
detail in his ambitious paper.

Yellow [41], in dealing with Euclidean networks,
eliminates the need to initially compute the
half-matrix of savings values by applying a simple
geometrical search technique based on the polar
coordinates of the demand points. In this 1970 paper,
he modifies the savings method to produce a sequential
Clarke-Wright method where we proceed as in Tyagi's
algorithm except that maximum savings rather than
minimum distance is our criterion. That is, if we have
just linked node i to the subtour, we next find the
largest feasible savings between node i and a nearby
node j not yet in the subtour.

The excellent book Distribution Management, published
in 1971, studies TSP's and VRP's in great detail [12].
Eilon et al. include an "r-optimal" procedure for the
VRP which borrows much from Lin's TSP approach [22].
Their procedure begins with a feasible solution and
tests perturbations of r arcs at a time to obtain
r-optimality. For example, if r = 2 they examine each
pair of arcs to see if it can be replaced by another
pair such that feasibility is preserved and total
distance is decreased. Lin's approach has been
extended by Lin and Kernighan [23] in 1973. It should
be noted that many of the book's computational
arguments are already out of date.
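
For r = 2 the exchange test is short enough to write out. The sketch below works on a single closed tour with symmetric distances, so the feasibility check mentioned above reduces to nothing; for a VRP, a capacity and route time check would be needed before accepting an exchange. The names two_opt and tour_length are hypothetical.

```python
import math

def tour_length(tour, d):
    """Length of the closed tour, including the arc back to the start."""
    n = len(tour)
    return sum(d[tour[a]][tour[(a + 1) % n]] for a in range(n))

def two_opt(tour, d):
    """Replace pairs of arcs while doing so shortens the tour."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for a in range(n - 1):
            for b in range(a + 2, n):
                if a == 0 and b == n - 1:
                    continue                   # these two arcs share a node
                i, i2 = tour[a], tour[a + 1]
                j, j2 = tour[b], tour[(b + 1) % n]
                # Replace arcs (i, i2) and (j, j2) by (i, j) and (i2, j2).
                if d[i][j] + d[i2][j2] < d[i][i2] + d[j][j2] - 1e-12:
                    tour[a + 1:b + 1] = reversed(tour[a + 1:b + 1])
                    improved = True
    return tour

# Hypothetical instance: the corners of a unit square, visited in a crossing order.
coords = {1: (0, 0), 2: (1, 1), 3: (1, 0), 4: (0, 1)}
d = {i: {j: math.dist(coords[i], coords[j]) for j in coords} for i in coords}

better = two_opt([1, 2, 3, 4], d)
print(better, tour_length(better, d))
```

The starting tour uses both diagonals of the square; one exchange removes the crossing and leaves a tour around the perimeter.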

Krolak et al. [21] in 1972 suggested a man-machine
interactive approach which has been successful on some
larger problems but seems to require an excessive
amount of man-machine time. In 1974, Gillett and
Miller [16] proposed a "sweep" algorithm for Euclidean
networks which ranks and links demand points by their
polar coordinate angle. Newton and Thomas have
recently investigated the VRP in connection with
school bus routing, dealing with a multi-school system
[26]. Their approach for each school has been to
obtain a near-minimum single trip solution to the
relaxed TSP, and then partition the resulting tour
into subtours satisfying the VRP constraints [25].
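
The ranking-and-linking step of a sweep can be sketched as below. This covers only the basic idea of ordering points by polar angle about the depot and cutting the sequence whenever capacity would be exceeded; it is not the published Gillett and Miller procedure, which contains further refinements. The function name sweep and the instance are hypothetical.

```python
import math

def sweep(coords, demand, capacity, depot=1):
    """Rank demand points by polar angle about the depot and fill vehicles in order."""
    x0, y0 = coords[depot]
    order = sorted(
        (i for i in coords if i != depot),
        key=lambda i: math.atan2(coords[i][1] - y0, coords[i][0] - x0),
    )
    routes, current, load = [], [], 0
    for i in order:
        if current and load + demand[i] > capacity:
            routes.append(current)            # close the route and start another
            current, load = [], 0
        current.append(i)
        load += demand[i]
    if current:
        routes.append(current)
    return routes

# Hypothetical instance: depot is node 1 at the origin.
coords = {1: (0, 0), 2: (0, 3), 3: (4, 3), 4: (4, 0), 5: (-3, 0)}
demand = {1: 0, 2: 2, 3: 2, 4: 2, 5: 3}

routes = sweep(coords, demand, capacity=6)
print(routes)   # [[4, 3, 2], [5]]
```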

In Beltrami and Bodin's very comprehensive 1974 paper
[3], a variety of problems concerning municipal waste
collection are explored. For example, they consider
the potential benefit of allowing more than one
vehicle to visit a site on the same day, each of which
services part of the demand at the site. In addition,
some pathological features of the Clarke-Wright
heuristics are pointed out.

In 1975, the emphasis in vehicle routing was on the
development of codes to solve large-scale VRP's.
Orloff, who had previously formulated the General
Routing Problem (GRP), extended his model to handle a
fleet of vehicles. The GRP concerns finding a minimum
cost cycle which traverses every arc and every node in
subsets R and Q of the arcs and nodes respectively. We
will discuss node routing and arc routing later in
this paper. For now we remark that the GRP formalizes
this dichotomy and, at the same time, defines a
continuum between these two extremes. Node routing
problems tend to be much more difficult to solve than
arc routing problems. Recently, Orloff and Caprera
[29] have exploited this observation in a heuristic
algorithm (of the Lin-Kernighan variety) for solving
large GRP's. The strategy is to convert required nodes
to required arcs whenever possible. In other words,
requiring certain major roads in a transportation
network to be traversed in a tour limits the degrees
of freedom in an algorithm and reduces running times.
Large VRP's may be exploited similarly.

In 1974, Russell [33] presented a modified version of
the Gillett and Miller algorithm. More recently,
Russell [34] has presented a VRP heuristic algorithm
for the special case where the number of tours is
pre-specified. Also, various sequencing and due-date
constraints are considered. The algorithm is an
extension of the Lin-Kernighan heuristic [23], and has
solved a 159 node problem in under 9 minutes on an IBM
370/168. Robbins et al. [32], also in 1975, have
assembled a tour construction-tour improvement code
which looks quite promising. Initially, a tour is
constructed using the Clarke-Wright approach. Next,
this tour is improved by Lin's 2-opt procedure [22].
In five of the seven cases cited, the improvement is
less than 1%; the largest improvement is 3.2%. A 150
node VRP was solved in about 37 seconds. This
technique has been applied by Vu and Turner [40] to a
rural refuse collection problem. The code presented by
Golden, Magnanti, and Nguyen [19] performs from one to
two orders of magnitude faster than these recently
developed codes although it may not be as accurate.

In their paper, the authors emphasize data structures
and list processing and present a new implementation
of the Clarke-Wright algorithm which is motivated by
(i) optimality considerations, (ii) storage
considerations, and (iii) sorting considerations and
program running time. The Clarke-Wright algorithm is
modified in the following three ways:

(1) by using a route shape parameter $\gamma$ to
define a modified savings
$s_{ij} = d_{1i} + d_{1j} - \gamma d_{ij}$ and finding
the best route structure obtained as the parameter is
varied;

(2) by considering savings only between nodes that are
"close" to each other;

(3) by storing savings $s_{ij}$ in a heap structure to
reduce comparison operations and ease access.

This modified algorithm has been applied to a
newspaper distribution problem containing nearly 600
drop points. The problem was solved in less than 20
seconds of execution time on an IBM 370/168.
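
The three modifications can be illustrated together in a short sketch. It is not the authors' code: the function name build_savings_heap, the distance threshold max_link used to decide which nodes count as "close", and the instance are assumptions, and Python's heapq module stands in for whatever heap implementation was used.

```python
import heapq
import math

def build_savings_heap(d, nodes, gamma=1.0, max_link=math.inf, depot=1):
    """Heap of modified savings s_ij = d_1i + d_1j - gamma * d_ij for close pairs."""
    heap = []
    for i in nodes:
        for j in nodes:
            if i < j and d[i][j] <= max_link:            # (2) only "close" pairs
                s = d[depot][i] + d[depot][j] - gamma * d[i][j]   # (1) shape parameter
                heap.append((-s, i, j))                  # negate: heapq is a min-heap
    heapq.heapify(heap)                                  # (3) heap instead of a full sort
    return heap

# Hypothetical instance: depot is node 1, distances are Euclidean.
coords = {1: (0, 0), 2: (0, 3), 3: (4, 3), 4: (4, 0), 5: (-3, 0)}
d = {i: {j: math.dist(coords[i], coords[j]) for j in coords} for i in coords}

heap = build_savings_heap(d, [2, 3, 4, 5], gamma=1.0, max_link=5.0)
neg_s, i, j = heapq.heappop(heap)       # largest saving comes off first
print(-neg_s, i, j)                     # 6.0 3 4
```

With the threshold set to 5.0, two of the six node pairs are never stored, and the heap yields the remaining savings in decreasing order without sorting the whole list up front. Rerunning the construction for several values of gamma and keeping the best set of routes corresponds to modification (1).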


Holmes and Parker [20], in 1976, have suggested
another improvement to the Clarke-Wright algorithm.
The modification entails a progression from one
feasible solution to a next, until we find a local
minimum with the following property. If we prohibit
any one edge in the current solution from appearing in
the next solution (by setting its savings value to
zero), and reapply the Clarke-Wright algorithm, we
cannot improve upon the total distance. In terms of
accuracy, this concept of local minimum seems to yield
nice results. Computationally, however, this approach
is relatively slow.

Vehicle routing algorithms have recently "come of age"
in the sense that they are now capable of solving some
large-scale real-world problems. As indicated in this
survey, the development has been a slow process. We
can categorize the chronological stages of growth as
follows:

1)  early  formulations; 

2)  early  algorithms  for  hand  computations; 

3)  exact  algorithms  for  small  problems; 

4)  more  efficient,  computerized  algorithms  with 
an  emphasis  on  implementation  for  moderate 
size  problems; 

5)  second  generation  codes,  much  faster — for 
large  problems; 

6)  real-world  applications. 

There  are,  of  course,  many  important  variants  of 
the  VRP,  some  of  which  we  will  mention,  for  which 
no  satisfactory  solution  techniques  yet  exist. 

We  remark  in  closing  this  section  that  many  of  the 
papers  mentioned  are  no  more  than  variations  and 
combinations  of  the  "savings"  method,  the  "sweep" 
method,  the  "nearest-neighbor"  method,  and  the 
"r-opt"  method. 

Counting  and  the  VRP 

For  hard  integer  programming  problems  such  as  the 
VRP  where  enumerative  techniques  are  the  primary 
tool  for  solving  even  the  smallest  problems 
exactly,  the  number  of  feasible  solutions  is  a  key 
factor  in  algorithmic  performance.     Generally  it  is 
impossible  to  explicitly  enumerate  and  evaluate  all 
the  feasible  solutions  for  all  but  the  smallest 
problems.     In  this  section,  we  discuss  partitions 
and  counting  feasible  solutions  to  the  VRP. 

In  order  to  get  an  idea  of  the  number  of  feasible 
solutions  to  the  VRP,  consider  a  situation  where  m 
vehicles  are  available  to  serve  n  demand  points. 
Each  vehicle  can  make  no  more  than  k  stops  due  to 
capacity  constraints.     The  number  of  feasible 
solutions  gives  an  indication  of  the  complexity  of 
these  difficult  combinatorial  programming  problems. 
We  demonstrate  an  approach  for  calculating  this 
number  based  on  a  simple  recursion  formula. 

As background, we briefly introduce some notions about
partition theory. The theory of partitions is an area
of number theory which deals with the representation
of integers as sums of other integers.

Definition 1: A partition of a non-negative integer n
is a representation of n as a sum of positive
integers, called either summands or parts of the
partition. The order of the summands is irrelevant.
The partitions of 5 are 5, 4+1, 3+2, 3+1+1, 2+2+1,
2+1+1+1, and 1+1+1+1+1. Thus, there are seven
partitions of 5. We remark that 0 has one partition,
the empty partition, and that the empty partition has
no parts. An excellent source on elementary partition
theory is Andrews [1].

Theorem 1: Let $P_k(m,n)$ denote the number of
partitions of n into exactly m parts, each of which
does not exceed k. Then

\[
P_k(m,n) =
\begin{cases}
P_{k-1}(m,n) + P_k(m-1,\,n-k) & \text{for } k \ge 1,\ 1 \le m \le n \\
1 & \text{for } k \ge 1,\ m = n = 0 \\
0 & \text{otherwise.}
\end{cases}
\]

Proof: The partitions of n into m parts with no part
exceeding k can be divided according to whether or not
k is a summand. If it is not, then there are
$P_{k-1}(m,n)$ partitions. If it is a summand, then
the remaining sum is n-k, and we divide into m-1 parts
not exceeding k, so $P_k(m-1,n-k)$ describes the
appropriate number of such partitions. These events
are obviously mutually exclusive and collectively
exhaustive, and the recursion is obtained. Boundary
conditions are as indicated.
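
The recursion translates directly into a memoized function. The sketch below assumes the boundary conditions read as stated here (value 1 when k is at least 1 and m = n = 0, value 0 in every case not covered by the recursion); the function name P with argument order (k, m, n) is chosen for the example.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def P(k, m, n):
    """Number of partitions of n into exactly m parts, none exceeding k."""
    if k >= 1 and m == 0 and n == 0:
        return 1                               # the empty partition
    if k >= 1 and 1 <= m <= n:
        # Either k is not a summand, or one summand equal to k is removed.
        return P(k - 1, m, n) + P(k, m - 1, n - k)
    return 0

# Partitions of 5 with any number of parts and no restriction on part size.
print(sum(P(5, m, 5) for m in range(1, 6)))    # 7
# Partitions of 6 into two parts, each at most 4: 4+2 and 3+3.
print(P(4, 2, 6))                              # 2
```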


Definition 2: Let $\phi_k(m,n)$ be the number of
feasible solutions to the VRP with m vehicles, n
demand points, and a delivery capacity of k demand
points for each vehicle. We assume vehicles are
indistinguishable.

Definition 3: Let $S_k(m,n)$ be the set of all sets
$s = \{p_1, p_2, \ldots, p_m\}$ such that
$p_1 + p_2 + \cdots + p_m = n$ and $0 \le p_i \le k$.

Theorem 2:

\[
\phi_k(m,n) = n! \sum_{\ell=1}^{m} P_k(\ell,n)
\]

Proof:

\[
\phi_k(m,n) = \sum_{s \in S_k(m,n)} \binom{n}{p_1, p_2, \ldots, p_m}\, p_1!\, p_2! \cdots p_m!
\]

since we must first distribute the n demand points
among the m vehicles (we need not use all the
vehicles) and then we must order the points for each
vehicle. The value of $p_i$ is the number of demand
points assigned to vehicle i. So,

\[
\phi_k(m,n) = \sum_{s \in S_k(m,n)} \frac{n!}{p_1!\, p_2! \cdots p_m!}\,(p_1!\, p_2! \cdots p_m!)
\]

= n! · {the number of partitions of the integer n into
at most m parts that have no part exceeding k}

\[
= n! \sum_{\ell=1}^{m} P_k(\ell,n).
\]

The Traveling Salesman Problem is a special case of
the VRP with m = 1 and k = n. $\phi_k(1,n)$ is given
by $\phi_n(1,n) = n!\,P_n(1,n) = n!$. For example,
when n = 10, there are $3.6 \times 10^{6}$ tours; when
n = 50, $3.04 \times 10^{64}$ tours; when n = 100,
$9.3 \times 10^{157}$ tours.

We can see how combinatorially explosive TSP's can be.
From Theorem 2 we learn that for a problem with n
demand nodes the VRP has many more feasible solutions.
Suppose we have a situation with 6 demand nodes, 4
vehicles, and a capacity of 4 units. There are a total
of 6! = 720 TSP tours. The eleven partitions of 6 are
shown below:

6
5+1
4+2
4+1+1
3+3
3+2+1
3+1+1+1
2+2+2
2+2+1+1
2+1+1+1+1
1+1+1+1+1+1.

Since $\phi_4(4,6) = 6! \sum_{\ell=1}^{4} P_4(\ell,6)$
and $\sum_{\ell=1}^{4} P_4(\ell,6) = 7$, the number of
feasible solutions to the VRP is 7! = 5040. In
general, we can determine $\phi_k(m,n)$ easily after
first computing $P_k(\ell,n)$ for
$\ell = 1, 2, \ldots, m$ via Theorem 1.
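
The count in Theorem 2 can be evaluated numerically with the recursion of Theorem 1. This sketch is self-contained and assumes the same reading of the boundary conditions as in Theorem 1; the function names P and phi, with arguments in the order (k, m, n), are chosen for the example.

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def P(k, m, n):
    """Partitions of n into exactly m parts, none exceeding k (Theorem 1)."""
    if k >= 1 and m == 0 and n == 0:
        return 1
    if k >= 1 and 1 <= m <= n:
        return P(k - 1, m, n) + P(k, m - 1, n - k)
    return 0

def phi(k, m, n):
    """Feasible VRP solutions with m vehicles, n demand points, capacity k (Theorem 2)."""
    return factorial(n) * sum(P(k, l, n) for l in range(1, m + 1))

print(phi(4, 4, 6))     # 5040, the six-node example with 4 vehicles of capacity 4
print(phi(10, 1, 10))   # 3628800 = 10!, the TSP special case m = 1, k = n
```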


The  Capacitated  Chinese  Postman  Problem 

The problem of finding an optimal route for a single
vehicle over a network is a common and, as we have
seen, very difficult combinatorial problem. Routing
problems can be classified as node routing problems,
arc routing problems, or general routing problems.
Bodin [5] provides a convenient taxonomy for these
problems. The problem of visiting all nodes in a
network in the minimal amount of time (node routing)
is the classical TSP. The problem of covering all arcs
of a network while minimizing the total distance
traveled (arc routing) is the Chinese Postman Problem
(CPP). The General Routing Problem (GRP) on network
G = G(N;A) (N is the set of all nodes, A the set of
all arcs) is a generalization which includes the TSP
and the CPP as special cases. Here we seek the minimum
cost cycle which visits every node in subset
$Q \subseteq N$ and covers every arc in subset
$R \subseteq A$. This paper is primarily concerned
with node routing problems. Orloff introduces the GRP
and later extends it to handle more realistic problem
situations [27], [28], [29].

Applications of arc routing problems include routing
of street sweepers, snow plows, household refuse
collection vehicles, postmen, the spraying of roads
with salt-grit to prevent ice formation, the
inspection of electric power lines, gas, or oil
pipelines for faults, etc. Since in many vehicle
routing situations customers are so numerous that
identifying them individually becomes cumbersome, we
can consider the CPP to be the continuous counterpart
to the discrete TSP. Here an arc replaces a number of
customers.

In the CPP we assume that all arcs are undirected.
Stricker [36] and Christofides [6] consider an
extension which we will call "the Capacitated Chinese
Postman Problem" (CCPP), which reflects real-life
situations more directly. We are given arc demands
$Q = [q_{ij}]$ which must be satisfied by vehicles of
capacity W. A number of cycles must be formed which
traverse every arc, satisfy demands, and yield minimum
distance. This is an arc routing version of the VRP.
Stricker and Christofides have suggested heuristic
procedures for obtaining reasonable solutions to this
problem. However, an optimal solution appears to be
very difficult to obtain for all but the most trivial
problems. We give an integer programming formulation
for the CCPP below which indicates the inherent
complexity of this problem.
$$\text{Minimize} \quad \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{p=1}^{NP} c_{ij}\, x_{ij}^{p} \tag{11}$$

subject to

$$\sum_{k=1}^{n} x_{ik}^{p} - \sum_{k=1}^{n} x_{ki}^{p} = 0 \quad \text{for } i = 1, \ldots, n;\ p = 1, \ldots, NP \tag{12}$$

$$\sum_{p=1}^{NP} \left( x_{ij}^{p} + x_{ji}^{p} \right) \ge 1 \quad \text{for all } (i,j) \in A \tag{13}$$

$$\sum_{p=1}^{NP} \left( \ell_{ij}^{p} + \ell_{ji}^{p} \right) = 1 \quad \text{for all } (i,j) \in A \tag{14}$$

$$x_{ij}^{p} \ge \ell_{ij}^{p} \quad \text{for all } (i,j) \in A \text{ and } p = 1, \ldots, NP \tag{15}$$

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \ell_{ij}^{p}\, q_{ij} \le W \quad \text{for } p = 1, \ldots, NP \tag{16}$$

$$x_{ij}^{p} \ge 0 \text{ and integer} \tag{17}$$

$$\ell_{ij}^{p} \in \{0, 1\} \tag{18}$$

where

  NP = the number of available postmen
  $x_{ij}^{p}$ = the number of times arc (i,j) is traversed by postman p
  $\ell_{ij}^{p}$ = 1 if postman p services arc (i,j), 0 otherwise
  $q_{ij}$ = the demand on arc (i,j)
  W = the vehicle capacity.

The objective function (11) seeks to minimize distance traveled.
Equations (12) ensure route continuity. By (13), every arc is
covered at least once. Equations (14) state that each arc is
serviced exactly one time. Equations (15) guarantee that arc
(i,j) can be serviced by postman p only if he covers arc (i,j).
Every demand is satisfied, by equations (16). For the small
example provided by Christofides [5] with 12 nodes, 22 arcs, and
5 postmen, the above formulation requires 329 constraints and
440 integer variables.

Earlier in this paper, we mentioned that Balinski and Quandt
formulated the VRP as a set partitioning problem. Similarly, we
can view the CCPP as the following set covering problem:

$$\text{Minimize} \quad \sum_{j=1}^{t} c_j x_j$$

subject to

$$\sum_{j=1}^{t} a_{ij} x_j \ge 1 \quad \text{for all } i$$

$$x_j \in \{0, 1\} \quad \text{for all } j$$

where

  $c_j$ = the distance associated with tour j
  $x_j$ = 1 if feasible tour j is used, 0 otherwise
  $a_{ij}$ = 1 if arc i is covered in feasible tour j, 0 otherwise
  t = the number of feasible tours.

As with the VRP, the number of feasible tours can be enormous.
The CCPP seems to belong to the same problem class as the TSP
and the VRP. Any optimal algorithm for its solution will most
likely be exponential.

Christofides [6] presents a heuristic algorithm which appears to
be efficient and effective. Computational results are presented
for graphs with up to 50 nodes and 125 arcs. Running times are
less than 30 seconds and the percent deviation from a lower
bound on the solution is less than nine in all cases. Details of
the algorithm are clarified and an example is worked through in
Christofides [6].

We stress the fact that the CPP and CCPP are the arc routing
counterparts to the TSP and VRP. A primary conclusion is that
the CCPP is, indeed, a practical problem which deserves much
more research attention.

Vehicle Scheduling

We note that any logistics system must be composed of both
routing and scheduling components. Although the focus of this
paper is on the routing aspect (the time dimension has so far
been ignored), we now develop a framework for attacking the two
problems simultaneously. The problems have been treated
sequentially in Golden [18] in terms of the well-known cutting
stock problem.

An alternate, and as yet unexplored, means of connecting the
routing and scheduling components is through what Simpson [35]
refers to as a schedule map. Whereas a route map considers a
spatial, geographical network, the schedule map includes a time
dimension as well. Simpson reports on models for public
transportation systems, such as airline systems, for which this
distinction is valuable. In the figures below this distinction
is illustrated.


Figure 1a: Route map
Figure 1b: Schedule map


In the schedule map, each node represents a geographical
location and a specific time. We discretize the time axis by
dividing the basic unit (perhaps a workday) into a number of
smaller units or periods. We can imagine, in the schedule map
above, an eight-hour workday broken into hour units. The time
points correspond to 9:30, 10:30, 11:30, ..., 4:30. The
distances on the route map stand for travel times in terms of
these hour units. For example, the travel time from A to B is
approximately one hour, from B to C takes approximately two
hours, and so on. The number of time points for each location
(in this case, eight) is chosen such that the travel times are
appropriate. Note that in the schedule map presented, we have
split location A into two locations A' and A" in order to make
the network more readable.
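
The construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_schedule_map`, the symmetric travel times, and the rule that an arc exists whenever the arrival falls inside the workday are ours, not taken from Simpson [35]. Period 0 stands for 9:30 and period 7 for 4:30.

```python
from itertools import product

def build_schedule_map(travel_time, periods):
    """Expand a route map into a schedule map (time-expanded network).

    travel_time: dict mapping (origin, destination) -> travel time in whole periods
    periods: number of discrete time points kept for each location
    Returns (nodes, arcs), where every node is a (location, period) pair.
    """
    locations = sorted({loc for pair in travel_time for loc in pair})
    nodes = list(product(locations, range(periods)))
    arcs = []
    for (u, v), t in travel_time.items():
        for start in range(periods):
            arrive = start + t
            if arrive < periods:  # arrival must fall inside the workday
                arcs.append(((u, start), (v, arrive)))
    return nodes, arcs

# Travel times quoted in the text: A to B about one hour, B to C about two.
# Assuming the same times in the reverse direction.
travel = {("A", "B"): 1, ("B", "A"): 1, ("B", "C"): 2, ("C", "B"): 2}
nodes, arcs = build_schedule_map(travel, periods=8)
print(len(nodes), len(arcs))
```

Even this three-location example produces 24 nodes, which illustrates how quickly the network grows once time is represented explicitly.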

Once we have constructed a schedule map for the VRP we can
proceed to route and schedule simultaneously by applying a
sequential Clarke-Wright algorithm. Savings may be defined in
terms of travel times between nodes, or in terms of travel
distances. When a node i has just been included in a tour, we
erase all other nodes which correspond to the same location but
different time, and then find another feasible node j (not yet
in a tour) which yields the best savings $s_{ij}$. By feasible,
we mean that capacity constraints are not violated. Routes begin
from the origin at various time points in conformity with the
vehicle availability limitations.
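
One step of this selection can be sketched as follows. The sketch assumes the usual Clarke-Wright saving, $s_{ij} = c_{0i} + c_{0j} - c_{ij}$ with 0 the origin; the function names, the data, and the omission of any check that node j can be reached in time from node i are our simplifications for illustration.

```python
def saving(dist, origin, a, b):
    """Clarke-Wright saving: distance avoided by serving a and b on one route."""
    return dist[origin, a] + dist[origin, b] - dist[a, b]

def best_next(last, candidates, visited, load, demand, capacity, dist, origin="O"):
    """Return the feasible (location, period) node with the best saving, or None."""
    best, best_s = None, None
    for loc, period in candidates:
        if loc in visited:
            continue  # erased: same location at a different time
        if load + demand[loc] > capacity:
            continue  # would violate the capacity constraint
        s = saving(dist, origin, last[0], loc)
        if best_s is None or s > best_s:
            best, best_s = (loc, period), s
    return best, best_s

# Hypothetical data: symmetric distances and demands by location.
dist = {("O", "A"): 4, ("O", "B"): 5, ("O", "C"): 6,
        ("A", "B"): 2, ("A", "C"): 7, ("B", "C"): 3}
dist.update({(b, a): d for (a, b), d in list(dist.items())})
demand = {"A": 3, "B": 4, "C": 6}

last = ("A", 1)  # node just included in the tour
candidates = [("A", 2), ("B", 2), ("B", 3), ("C", 3)]
choice, s = best_next(last, candidates, visited={"A"}, load=3,
                      demand=demand, capacity=8, dist=dist)
print(choice, s)
```

Here location C is rejected on capacity grounds and the second copy of A is skipped because A has already been visited, so a copy of B is chosen.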

This model leads us to a Generalized Vehicle Routing Problem
(GVRP). In the GVRP we are interested in routing vehicles over
the collections $S_1, \ldots, S_k$ of nodes from a given origin.
Each node has a specific demand. We require that at least one
node in each collection $S_i$ be serviced and we seek to
minimize total distance traveled. The schedule map is one
special case of the GVRP. This general problem has applications
in many other routing situations where, for instance, there are
a number of control offices or warehouses in each small area of
a much larger region which must be serviced.

The difficulty with a schedule map is evident. Namely, the
number of nodes and arcs to consider grows very rapidly once the
time dimension is depicted. In any case, the model itself is
helpful for understanding the complex relationship between the
routing and scheduling components of a logistics system.

Conclusion 

In this paper we have provided a concise overview of vehicle
routing. The enormous number of feasible solutions to these
integer programming problems becomes a factor in algorithmic
performance. We demonstrate how one might determine this number.
Two new directions in vehicle routing research have been
proposed: the first is an arc routing problem intimately related
to the VRP--the Capacitated Chinese Postman Problem; the second
is a framework for envisioning the interaction between routing
and scheduling via the schedule map--the Generalized Vehicle
Routing Problem. There are, of course, other important new
research areas in vehicle routing, such as the evaluation of
heuristic algorithms, which we have not discussed here.

In brief, the indication is that some of the suggested
procedures can be used as effective decision-making tools for
large-scale Vehicle Routing Problems encountered in many
practical situations in both the public and private sectors.


Abstract

Since the introduction of the additive algorithm by Egon Balas
in 1965 [1], numerous investigators have added refinements and
improvements to the basic procedure to improve the convergence
properties of the algorithm. These embellishments have included
procedures for changing the search origin, altering the criteria
for augmenting variables, etc., and have included the use of
composite constraints for fathoming partial solutions early in
the augmentation phase of the algorithm. In this paper, we will
present computational experience with many of these refinements
for the binary as well as the bounded integer problem.

Introduction 

A  zero-one  integer  linear  programming  problem  can 
be  written  in  the  form: 

$$\begin{aligned}
\text{minimize:} \quad & cx \\
\text{subject to:} \quad & b + Ax \ge 0 \\
& x_j \in \{0, 1\} \\
& c_j \ge 0
\end{aligned} \tag{1}$$

where c and x are n-tuples, b is an m-tuple, and A is an m x n
matrix. Form (1) is termed the standard form. Any zero-one
programming problem can be written in form (1) by a suitable
definition of variables and manipulation of constraints.
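
To make the conventions of form (1) concrete (in particular the sign convention b + Ax >= 0), the following sketch checks feasibility and solves a tiny instance by complete enumeration. The instance and the function names are ours, chosen only for illustration; complete enumeration is of course practical only for very small n and is exactly what implicit enumeration tries to avoid.

```python
from itertools import product

def is_feasible(A, b, x):
    """True when b + A x >= 0 holds row by row."""
    return all(bi + sum(aij * xj for aij, xj in zip(row, x)) >= 0
               for row, bi in zip(A, b))

def brute_force(c, A, b):
    """Enumerate all 2**n binary vectors and keep the cheapest feasible one."""
    best_x, best_z = None, None
    for x in product((0, 1), repeat=len(c)):
        if is_feasible(A, b, x):
            z = sum(cj * xj for cj, xj in zip(c, x))
            if best_z is None or z < best_z:
                best_x, best_z = x, z
    return best_x, best_z

# Hypothetical instance: 2*x1 + 3*x2 + x3 >= 3 and x1 - x2 + 2*x3 >= 1.
c = [3, 5, 4]
A = [[2, 3, 1], [1, -1, 2]]
b = [-3, -1]
x_opt, z_opt = brute_force(c, A, b)
print(x_opt, z_opt)
```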

A  bounded  integer  linear  programming  problem  can 
be  written  in  the  form 

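$$\begin{aligned}
\text{minimize:} \quad & cx \\
\text{subject to:} \quad & b + Ax \ge 0 \\
& 0 \le x \le d \\
& x_j \text{ integer} \\
& c_j \ge 0
\end{aligned} \tag{2}$$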

where d is an n-tuple, and the other variables are as defined in
(1) above. Suitable transformations of variables and constraints
can convert (if necessary) any bounded integer linear
programming problem into form (2).

Procedures effective in solving (1) can be used in solving (2)
through a binary expansion of the integer variables. Similarly,
(1) is a special case of (2) in which $d_j = 1$,
$j = 1, 2, \ldots, n$. There exists widespread interest in
obtaining solutions to (1) and (2) because of the wide variety
of problems which can be formulated as zero-one and bounded
integer programming problems (see, for example, [9], [21], [22],
[26]).
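
A binary expansion replaces each bounded integer variable $x_j$ by bits $y_{jk}$ with $x_j = \sum_k 2^k y_{jk}$. The sketch below shows one common way of doing this; the function names are ours. Note that when $d_j + 1$ is not a power of two the bits can represent values above $d_j$, so the bound must be kept as an explicit constraint.

```python
def binary_expansion(d):
    """Bit weights needed to represent any integer x with 0 <= x <= d."""
    return [2 ** k for k in range(d.bit_length())]

def expand_column(column, d):
    """Replace one column of A by one column per bit (weight times the column)."""
    return [[w * a for a in column] for w in binary_expansion(d)]

weights = binary_expansion(5)        # three bits cover 0..7, so x <= 5 must be kept
columns = expand_column([3, -1], 5)  # one scaled copy of the column per bit
print(weights, columns)
```

The cost coefficient $c_j$ is scaled by the same weights, so a variable with upper bound 5 contributes three zero-one variables to form (1).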

In this paper, we will evaluate several extensions to the basic
Balasian algorithm for solving four classes of integer
programming problems. These classes include capital budgeting
problems of the Lorie-Savage variety [21], single-constraint
allocation problems from Trauth and Woolsey [27], fixed-charge
or bounded integer programming problems from Haldi [14], and
"IBM test" problems also from Haldi and from Trauth and Woolsey.
We will not concern ourselves in this paper with branch and
bound procedures of the Land and Doig variety for solving the
integer programming problem, as these methods represent
fundamentally different approaches. Computational results
reported do, however, present the opportunity for comparisons
between approaches (depth first search vs. best first search).

Many investigators who have suggested refinements to the
Balasian enumeration algorithm have also made available FORTRAN
computer programs for testing their refinements [12], [15],
[18], [24], [29]. An empirical investigation of each of these
improvements remains to be completed. In the next section, the
computer programs and modifications examined are briefly
described. Each program examined is basically an enumeration
algorithm of the Balasian type with certain improvements and
modifications to eliminate large portions of the search. Section
III discusses the computational experience observed in solving
categories of problems with the available techniques. Section IV
summarizes the results of the investigation and discusses future
improvements to enumeration algorithms which appear promising.

Computer Codes and Modifications Examined

In this section, each of the computer codes (techniques)
investigated is briefly described. Each of the codes implements
an enumerative search procedure which is a basic Balasian
algorithm with modifications designed to exclude portions of the
search (implicit enumeration). Each of the codes was
redimensioned (upward) from its original version to fit into
280 K bytes of core. Each code is written entirely in the
FORTRAN language and is an "in-core" program. Distinguishing
features or characteristics of each of the four codes examined
are summarized in Table 1.


Table 1

Computer Codes and Modifications Examined

Characteristics common to all codes
  Language:        FORTRAN V (or IV)
  Computer (IBM):  370/168

RIP30C
  Mode of arrays:                              [illegible]
  Maximum problem size, m x n (280 K bytes):   140 x 140
  Imbedded L.P. present:                       Revised Simplex (with Explicit Inverse)
  Particular variation examined:
    1) Imbedded L.P. called every time.
    2) One surrogate constraint carried.
    3) L.P. start used.

DZIP1
  Mode of arrays:                              Integer
  Maximum problem size, m x n (280 K bytes):   200 x 200
  Imbedded L.P. present:                       None
  Particular variation examined:
    1) Branching was determined after preferred variables were selected.
    2) Full cancellation test was used.
    3) Starting solution always empty.

DZLP
  Mode of arrays:                              Real (with some double precision)
  Maximum problem size, m x n (280 K bytes):   [illegible]
  Imbedded L.P. present:                       Dual Simplex
  Particular variation examined:
    1) Branching was determined after preferred variables were selected.
    2) L.P. applied after cancellation test.
    3) L.P. (roundup) determined initial origin.
    4) One restart used at first solution.

HOLCOMB
  Mode of arrays:                              Integer
  Maximum problem size, m x n (280 K bytes):   200 x 200
  Imbedded L.P. present:                       None
  Particular variation examined:
    1) More than one variable could enter the basis in an iteration.
    2) Starting solution was provided by variable values which
       satisfied the sum of all constraints.

ENUMBER8
  Mode of arrays:                              [illegible]
  Maximum problem size, m x n (280 K bytes):   140 x 140
  Imbedded L.P. present:                       Revised Simplex
  Particular variation examined:
    1) Bound redefinition employed.
    2) L.P. called at every iteration.
    3) Starting solution always empty.
    4) Both Minimum Branch and Balas' Augmentation rules investigated.


RIP30C (Geoffrion and Nelson [12])

The code of Geoffrion and Nelson employs a basic Balasian
algorithm and an imbedded L.P. to provide an initial partial
solution and to compute a "strongest" surrogate constraint (as
defined by Geoffrion [9]). The major advantage of the particular
surrogate constraint introduced by Geoffrion and incorporated
into RIP30C is that the dual of the required linear program
coincides exactly with the continuous version of the zero-one
problem in the free variables. Hence, if in computing the
composite constraint the partial solution is not fathomed, and
if the dual variables are integers (0 or 1), an optimal
completion of the partial solution has been found and
backtracking can begin in the next iteration. In the
computational experience quoted by Geoffrion, it is stated that
the "use of the imbedded L.P. greatly reduced solution times in
virtually every case." This is opposed to the use of the
algorithm without the advantage of the imbedded L.P.

The program is written in general form and takes no advantage of
special problem structures. At each iteration a simple test for
binary infeasibility and for conditional binary infeasibility is
performed, with each constraint being considered individually.

The strategy choices made by the user concern the frequency with
which a surrogate constraint is computed and the maximum number
of surrogate constraints carried. The L.P. routine used employs
the revised simplex method with explicit inverse by Clasen [4].

The only difficulty encountered in using this code concerned the
value of ZKBAR input (0). The program looks only for feasible
solutions with value at least ZKBAR + .99999 less than the best
feasible solution. However, on the IBM 370/168 computer, the
code originally did not find the optimal solution on many
problems, but terminated at a solution which was always 1
greater than the optimal solution (all $c_j$'s being integer).
This difficulty was easily remedied by setting ZKBAR equal to
-0.5 on the input card. This difficulty is felt to be machine
dependent.

Two versions of RIP30C were investigated: one in which the
variables were input in the order given in the problem (RIP30C),
and the other in which the variables were initially sorted in
increasing order of $c_j$ (SORTED RIP30C). This latter
refinement has been found effective elsewhere, and is easily
implemented by the user with no modifications to the existing
code required.
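
Because the sort is applied to the input data only, it can be sketched as a small preprocessing step. The function names below are ours; the only point illustrated is that the columns of A must be permuted together with c, and that a solution of the sorted problem must be mapped back to the original variable order.

```python
def sort_by_cost(c, A):
    """Reorder variables by increasing cost, permuting the columns of A to match."""
    order = sorted(range(len(c)), key=lambda j: c[j])
    c_sorted = [c[j] for j in order]
    A_sorted = [[row[j] for j in order] for row in A]
    return order, c_sorted, A_sorted

def restore(x_sorted, order):
    """Map a solution of the sorted problem back to the original variable order."""
    x = [0] * len(order)
    for position, j in enumerate(order):
        x[j] = x_sorted[position]
    return x

order, c_sorted, A_sorted = sort_by_cost([7, 2, 5], [[1, 0, 3], [4, 1, 2]])
print(order, c_sorted, A_sorted, restore([1, 0, 1], order))
```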
DZIP1 (Lemke and Spielberg [18])

The code of Lemke and Spielberg employs a direct search
technique developed by the authors. On each forward step, one
variable is elevated to one; on each backward step one variable
is "cancelled" back to 0. Since the algorithm begins with all
variables at level 0, the algorithm terminates (in a formal
sense) when all variables have been "cancelled" back to level 0.

At each partial solution, the algorithm resolves the 0-1
subproblem in the currently free variables. A backward step is
taken whenever it can be established that any permissible
forward step leads to an unremovable infeasibility or to a
feasible solution not better than the best solution thus far
obtained (ceiling criteria). On forward steps, a "preferred set"
of the k free variables is constructed by using special types of
Gomory cuts and a process termed complete reduction. The
preferred variable yielding the maximal Balas value is then
elevated to 1. When the subproblem in (k-1) free variables is
resolved, the last variable to be elevated to 1 (LIFO procedure)
is cancelled to level 0, and the subproblem in the remaining
(k-1) free variables is again resolved. The authors state that
the procedure of considering a subset of the free variables, the
preferred set, reduces substantially the computing time required
and the number of points to be considered.
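
The forward/backward (LIFO) mechanics can be illustrated with a bare-bones implicit enumeration for form (1). This is a sketch under strong simplifications and is not DZIP1: it uses only a ceiling test and a simple unremovable-infeasibility test, and it branches on the cheapest free variable rather than on a preferred set or the Balas value. All names are ours.

```python
def implicit_enumeration(c, A, b):
    """Depth-first (LIFO) search for min c.x s.t. b + A x >= 0, x binary, c >= 0."""
    n, m = len(c), len(b)
    best_x, best_z = None, float("inf")
    x = [0] * n
    fixed = [False] * n
    # Stack entries are (variable, elevated): True while the variable sits at 1,
    # False once it has been cancelled back to 0 and is held there.
    stack = []
    while True:
        s = [b[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        z = sum(c[j] * x[j] for j in range(n))
        free = [j for j in range(n) if not fixed[j]]
        if z >= best_z:
            fathomed = True  # ceiling test: no completion can be cheaper
        elif all(si >= 0 for si in s):
            best_x, best_z = x[:], z  # free variables at 0 is the best completion
            fathomed = True
        else:
            # A violated row that the free variables can never repair.
            fathomed = any(s[i] + sum(max(A[i][j], 0) for j in free) < 0
                           for i in range(m))
        if not fathomed:
            j = min(free, key=lambda k: c[k])  # forward step: elevate to 1
            x[j], fixed[j] = 1, True
            stack.append((j, True))
            continue
        while stack and not stack[-1][1]:      # release variables already cancelled
            j, _ = stack.pop()
            fixed[j] = False
        if not stack:
            return best_x, best_z              # everything cancelled back to 0
        j, _ = stack.pop()                     # backward step: cancel last elevated
        x[j] = 0
        stack.append((j, False))

result = implicit_enumeration([3, 5, 4], [[2, 3, 1], [1, -1, 2]], [-3, -1])
print(result)
```

On this three-variable instance the search visits a handful of partial solutions before terminating with every variable cancelled back to level 0, as described above.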

The  code  uses  all  integer  arithmetic  which  has  a 
very  distinct  advantage:     half-word  integers  can 
be  used  for  the  arrays,  resulting  in  a  larger 
problem  residing  in  core  or  primary  memory. 

The program performs a change of variables (if necessary) from
"problem" to "program" variables in which
$0 \le c_i \le c_{i+1}$. This, as well as other elaborations to
the basic algorithm, has the effect of significantly reducing
computation time. The authors state that the code often finds
the optimal, or at least a feasible, solution rapidly, with a
major portion of the computing time being expended in the
"clean-up" phase, where the optimality of the solution is
verified.

DZLP (Salkin and Spielberg [24])

The code of Salkin and Spielberg employs a direct search
technique which is an elaboration of Balas' additive algorithm.
In addition, the code provides for the initial origin to be
generated by an imbedded L.P.-roundup start, and also provides
for the search to be restarted at improved zero-one solutions.
While restarting the search necessarily introduces enumeration
redundancy, the authors state that "the length of the search is
inversely proportional to the closeness of the origin to the
minimal solution" and the redundancy introduced by restarting
may be more than compensated for computationally by the
reduction in time expended in verifying the optimality of the
solution. Computational results reported empirically verify this
contention [23]. The code also contains infeasibility,
cancellation, and ceiling criteria similar to those of DZIP1.

The code uses real arithmetic, and, in addition, contains some
double precision arrays and variables. The program in the form
received required a few minor programming changes to run on the
IBM 370/168 computer. Along with a few "carriage control" and
other format statement changes, the major change involved
correcting the program from returning a "no feasible solution"
termination when the first call to the imbedded dual-simplex
routine yielded the optimal solution to the zero-one integer
problem. Apart from these minor problems, no additional
difficulties were encountered in using the program in its
original form.

HOLCOMB (Holcomb [15])

The program written by Holcomb at Union Carbide is a variant of
the Balasian algorithm with several heuristic tests available
for influencing the search strategy. These heuristic tests
replace existing tests in the algorithm and allow the user to
select a procedure for conducting the search. Heuristic options
available include provisions for initializing the program with a
starting solution which satisfies a "worst" constraint or which
satisfies a surrogate constraint which is the sum of all
constraints. Other options available include a provision for
entering more than one variable into the basis in an iteration
and for raising to 1 the first eligible variable rather than
conducting a time-consuming test to determine the next entering
variable. Two program strategies are available which are used
whenever variations are present in the magnitude of the ratio
$c_j / a_{ij}$ or whenever some constraint coefficients $a_{ij}$
are such that $c_j \cdot a_{ij} > 0$.

Best  results  were  obtained  using  this  code  to  solve 
the  included  problems  by  allowing  for  more  than  one 
variable  to  enter  the  basis  in  an  iteration  and 
by  providing  the  program  with  a  starting  solution 
which  satisfied  the  sum  of  all  constraints. 
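
The surrogate constraint formed as the sum of all constraints is simple to state for form (1): adding the rows of b + Ax >= 0 gives $\sum_i b_i + \sum_j (\sum_i a_{ij}) x_j \ge 0$. The sketch below computes it and then builds a starting solution that satisfies it. How HOLCOMB actually constructs its start is not described here, so the greedy rule (raise variables in increasing order of cost per unit of surrogate coefficient) is purely our assumption.

```python
def sum_surrogate(A, b):
    """Coefficients and constant of the constraint obtained by adding all rows."""
    alpha = [sum(row[j] for row in A) for j in range(len(A[0]))]
    beta = sum(b)
    return alpha, beta

def greedy_start(c, A, b):
    """Raise variables to 1 until the sum-of-constraints surrogate is satisfied."""
    alpha, beta = sum_surrogate(A, b)
    x = [0] * len(c)
    slack = beta
    helpful = sorted((j for j in range(len(c)) if alpha[j] > 0),
                     key=lambda j: c[j] / alpha[j])
    for j in helpful:
        if slack >= 0:
            break
        x[j] = 1
        slack += alpha[j]
    return x, slack >= 0

A = [[2, 3, 1], [1, -1, 2]]
b = [-3, -1]
alpha, beta = sum_surrogate(A, b)
start, satisfied = greedy_start([3, 5, 4], A, b)
print(alpha, beta, start, satisfied)
```

A start that satisfies the surrogate need not satisfy every original constraint; it only gives the enumeration an origin that is not obviously infeasible.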

The majority of the arithmetic calculations performed by the
code are performed in the fixed-point mode.

ENUMBER8 (Trotter [29])

The ENUMBER8 computer program was developed by Trotter and
Shetty [28] to solve the bounded variable integer programming
problem using implicit enumeration techniques. Their approach
represents an extension to Geoffrion's [9] algorithm for the
pure integer problem, employing simple fathoming tests based on
optimality and feasibility considerations, and employing
surrogate constraints similar to those defined by Geoffrion. The
authors also employ techniques to tighten the bounds on free
variables using the L.P. based procedure contained in the
algorithm, the effects of which are to simultaneously fathom
several partial solutions at a time.

As a user option, the augmentation phase of the algorithm
(explicit enumeration) may be guided by two mutually exclusive
criteria. (1) The maximum BALAS value is used to select that
variable $x_j$ to elevate to $d_j$ or its current upper bound.
(2) The variable $x_j$ which minimizes $d_j$ for all free
variables is fixed at its lowest permissible value.

The authors state that "utilization of the bound redefinition
capability appears to offer a general improvement in the
performance of the algorithm regardless of the augmentation rule
used, although in comparing the computational experience
obtained by varying the augmentation rule, the results are less
well-defined."

Several difficulties were encountered in using the ENUMBER8
computer code on the IBM computer. These difficulties arose, for
the most part, because of 0-subscripting in the arrays, with
concomitant problems with several of the indices in DO-loops
(being out of range). Several changes thus had to be made in the
program, and it is indeed possible that changes made in order to
run the program on the IBM computer may have affected the logic
originally intended by the authors. Unfortunately, time did not
permit us to flow-chart this procedure to examine fully the
implications of all changes made to their procedure. We have
since used ENUMBER8 on a CDC6500 and a UNIVAC 1110 computer, and
have not experienced the difficulty we encountered on the IBM
370.

ENUMBER8 is used in these experiments to assess the efficacy of
Trotter and Shetty's approach for solving the pure binary
problem ($d_j = 1$, all j), and for comparing their direct
approach for treating general integer variables with a binary
procedure using a binary expansion to represent the integer
variables.


All codes were compiled into object decks using the FORTRAN H
compiler with optimization feature (OPT = 2) in order to obtain
the lowest execution times possible. The timing subroutine for
each code returns relative CPU time in milliseconds using the
"T-timer" macro available on the supervisory system.

Computational  Experience 

Capital  Budgeting  Problems 

The capital budgeting problems given in Table 2 were originally
solved by Petersen [21] in his investigation of variants to the
basic Balasian algorithm, some of which were suggested in an
earlier paper by Glover. The problems vary in size from six
variables and ten constraints to fifty variables and five
constraints.

The lowest overall execution time on these seven problems was
achieved by SORTED RIP30C, followed by RIP30C and then by DZLP.
Both DZIP1 and HOLCOMB began experiencing difficulty in solving
these types of problems in the 25-30 variable range and above.
Apparently, the addition of the L.P. start and the adaptive
origin concept to the ideas originally developed by Lemke and
Spielberg possesses great merit in solving these types of
problems, as is shown in the computation results for DZIP1 and
DZLP in Table 2. (DZIP1 and HOLCOMB were unable to solve the
seventh and largest capital budgeting problem within the time
limit specified.) Sorting of problem variables was also of some
importance in reducing execution time for RIP30C. Unfortunately,
employing ENUMBER8 with all $d_j$-values set equal to 1 did not
result in low execution times for this group of problems,
especially as the number of variables in a problem increased.
This corroborates the experience reported by Trotter and Shetty
[28] in solving pure binary problems with their procedure.
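
The ranking quoted above can be checked against the summary row of Table 2. The short script below recomputes the average solution time over the problems each code solved, using the times transcribed from Table 2 (`None` marks the two runs that hit the 150-second limit); the dictionary layout and names are ours.

```python
# CPU seconds for problems 1-7, transcribed from Table 2.
times = {
    "SORTED RIP30C":  [0.050, 0.073, 0.174, 0.223, 0.383, 1.706, 3.587],
    "RIP30C":         [0.040, 0.077, 0.110, 0.137, 0.257, 0.947, 2.780],
    "DZIP1":          [0.077, 0.090, 0.340, 1.727, 26.030, 46.984, None],
    "DZLP":           [0.040, 0.103, 0.130, 0.323, 1.770, 2.337, 15.917],
    "HOLCOMB":        [0.054, 0.067, 0.123, 0.537, 8.407, 35.343, None],
    "ENUMBER8 BALAS": [0.054, 0.157, 0.490, 1.457, 5.084, 10.430, 29.167],
    "ENUMBER8 MBR":   [0.060, 0.164, 0.507, 1.494, 5.354, 11.464, 31.360],
}

averages = {}
for code, runs in times.items():
    solved = [t for t in runs if t is not None]
    averages[code] = round(sum(solved) / len(solved), 3)

for code, avg in sorted(averages.items(), key=lambda item: item[1]):
    print(f"{code:15s} {avg:7.3f}")
```

The recomputed averages agree with the tabulated summary values. Note that the averages for DZIP1 and HOLCOMB cover only six problems, so they understate the gap to the codes that solved all seven.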

Single-Constraint  Allocation  Problems 

Nine allocation problems were formulated by Trauth and Woolsey
to investigate the sensitivity of integer programming algorithms
to minor changes in the problem matrix. Each of the problems
consists of ten variables. The problems differ from one another
only in the value of b, the right hand side. All codes were able
to solve these problems in less than 0.10 seconds per problem,
hence the results are not presented in tabular form.

IBM Test Problems

The IBM test problems are a potpourri of integer programming
problems which (except for problem #3) feature matrices of 0's
and 1's. Trauth and Woolsey selected these problems for
examination because of "the wide disparity of solution times
between various approaches and because of the multiplicity of
optimum integer solutions." DZIP1 and HOLCOMB experienced
difficulty in solving this class of problems when the number of
variables increased to 30. The ENUMBER8 programming code
recorded its best results when the minimum branch rule was used
for variable augmentation (these integer problems were treated
directly by ENUMBER8). Overall best results were recorded by the
"unsorted version" of RIP30C and by DZLP, as indicated in
Table 3.

IBM problems 4 and 5 differ from each other only in the b-vector
($b_5 \ne b_4$), yet each of the codes examined experienced much
more difficulty in solving problem 5 than in solving problem 4.
While the difference in solution times cannot be explained by
the computational results reported, one can conjecture that
constraint severity (as determined by
$\left( \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} / b_i \right) / m$)
in these structured problems has as pronounced an effect on
solution times as do the number of variables and/or the number
of constraints. The results reported in Table 3 are similar to
those reported by Trauth and Woolsey [27, p. 491, Table V],
except that each of the codes showed less susceptibility to the
large number of constraints in problem 9 than did the integer
programming techniques they examined.

Fixed-Charge  Integer  Programming  Problems 

The  fixed-charge  Integer  programming  problems  of  |j 
Haldi  were  formulated  as  zero-one  integer  program- 
ming problems  by  a  binary  expansion  of  the  bounded 
integer  variables  using  the  binary  procedures  j 
examined,  and  were  treated  directly  with  the  " 
ENUMBERB  program.     Each  of  these  problems  features 
special  constraints  which  force  certain  variables 
to  assume  nonzero  values  If  other  variables  take 
on  nonzero  values.     Despite  the  small  size  of  these 
Table 2

Solution Times** for Petersen's Capital Budgeting Problems

Problem      Number of    Number of              SORTED                                ENUMBER8  ENUMBER8
 Number  0-1 Variables  Constraints    RIP30C    RIP30C     DZIP1      DZLP   HOLCOMB     BALAS       MBR

      1              6           10     0.050     0.040     0.077     0.040     0.054     0.054     0.060
      2             10           10     0.073     0.077     0.090     0.103     0.067     0.157     0.164
      3             15           10     0.174     0.110     0.340     0.130     0.123     0.490     0.507
      4             20           10     0.223     0.137     1.727     0.323     0.537     1.457     1.494
      5             28           10     0.383     0.257    26.030     1.770     8.407     5.084     5.354
      6             39            5     1.706     0.947    46.984     2.337    35.343    10.430    11.464
      7             50            5     3.587     2.780      150*    15.917      150*    29.167    31.360

Summary Measures

Number of Problems in Which Optimal Solution Was Found and Verified
                                            7         7         6         7         6         7         7

Average Solution Time** for Problems Solved
                                        0.885     0.621    12.541     2.946     7.422     6.691     7.200

*Indicates optimal solution was not found within allotted time of 150 seconds
**IBM 370/168 CPU time, in seconds

Indicates all c_j's have been multiplied by 10 in this version of the problem in order to have all c_j values in
integer form (to make a valid comparison between computer codes).


problems  (in  terms  of  the  number  of  original  inte- 
ger variables  and  number  of  constraints) ,  they  are 
quite  often  difficult  to  solve  using  known 
approaches.     Solution  times  for  each  of  the  fixed- 
charge  problems  using  each  of  the  programming 
codes  investigated  are  shown  in  Table  4. 
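
The binary expansion mentioned above can be sketched as follows. This is a minimal illustration, not the procedure used by any of the codes tested; the function names are hypothetical. It assumes each integer variable x satisfies 0 <= x <= u and replaces it by 0-1 variables with weights 1, 2, 4, ...

```python
def binary_expansion_weights(upper_bound):
    """Weights 1, 2, 4, ... whose 0-1 combinations cover 0..upper_bound."""
    if upper_bound < 1:
        raise ValueError("upper bound must be at least 1")
    n_bits = upper_bound.bit_length()  # smallest k with 2**k - 1 >= upper_bound
    return [2 ** k for k in range(n_bits)]


def decode(weights, bits):
    """Recover the integer value represented by a 0-1 assignment."""
    return sum(w * y for w, y in zip(weights, bits))


def expand_row(row, upper_bounds):
    """Rewrite one constraint row over integer variables as a row over 0-1 variables."""
    expanded = []
    for coeff, u in zip(row, upper_bounds):
        expanded.extend(coeff * w for w in binary_expansion_weights(u))
    return expanded
```

Note that the expansion can represent values above the bound (a bound of 10 needs four bits, which reach 15), so a formulation along these lines would also need an explicit constraint keeping each expanded variable at or below its original bound.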

The lowest mean execution time on this series of
problems was recorded by DZLP, followed by
ENUMBER8 using the minimum branch augmentation
rule, then by SORTED RIP30C, RIP30C, HOLCOMB,
DZIP1, and ENUMBER8 using the BALAS rule for
augmentation. The pure binary programming
procedures not employing any form of linear
programming to fathom partial solutions generally
showed a larger increase in computation times
as the number of variables in a problem
increased than did their LP-based counterparts.
Also, the ENUMBER8 program with the BALAS
augmentation rule was not generally very effective
in solving these problems.

It  is  interesting  to  compare  the  results  in  Table 
4  with  the  integer  programming  results  of  Trauth 
and  Woolsey  [27,  p.  490,  Table  III]  on  this 
series  of  problems.     Problems  1-4  and  7-8  are 
quite  similar,  differing  primarily  in  the  value 
of the b-vector and the value of the "fixed-charge."
While the codes RIP30C, DZIP1, and
HOLCOMB each record similar computation times for
problems  within  each  of  these  two  groups  and  a 
marked  difference  in  computing  times  between  the 
two  groups,  all  were  able  to  solve  the  problems 
in  a  reasonable  amount  of  time.     Of  the  five 
integer  programming  techniques  examined  by  Trauth 
and  Woolsey,  only  one  was  able  to  solve  all  of 
these  problems  within  the  number  of  iterations 
allowed.       Considering  the  difference  in  operating 
speeds  of  the  computers  used,  it  would  appear  that 
problems  1-4  were  solved  in  less  time  by  integer 
programming.     However,  all  binary  programming 
codes  were  able  to  solve  each  of  the  fixed-charge 
problems,  and  each  showed  far  less  variability 
in  solution  time  between  the  two  groups  of  prob- 
lems,  1-4  and  7-8. 

In  order  to  compare  the  direct  approach  of 
Trotter  and  Shetty     with  a  binary  expansion  of 
the  integer  variables  and  a  pure  zero-one  approach, 
the  fixed  charges  and  the  right  hand  sides  for  the 
problems  listed  in  Table  4  were  multiplied  by  10 
and  then  by  100.     For  the  binary  procedures,  this 
resulted  in  a  maximum  problem  size  of  71  variables; 
for  the  integer  approach,   this  simply  resulted  in 
increasing  the  bounds  on  the  variables,  d.. 
Summary results on these larger problems are
presented in Table 5. Due to the superiority of
the minimum branch rule in the ENUMBER8 program
Table 3

Solution Times** for IBM Test Problems of Haldi

Problem      Number of    Number of              SORTED                                ENUMBER8  ENUMBER8
 Number  0-1 Variables  Constraints    RIP30C    RIP30C     DZIP1      DZLP   HOLCOMB     BALAS       MBR

      1             21            7     0.294     0.466     0.304     0.186     0.407     0.320     0.226
      2             21            7     0.280     0.373     0.594     0.213     0.353     0.647     0.420
      3             20            3     0.064     0.106     0.697     0.056     0.090     0.784     0.603
      4             30           15     2.947     4.150    36.170     0.213   102.437     2.384     0.903
      5             30           15    11.286    12.154      150*    14.283      150*    27.440    10.133
   9***             15           50     3.890     4.052     1.604     4.223     1.127    28.830    11.656

Summary Measures

Number of Problems in Which Optimal Solution Was Found and Verified
                                            6         6         5         6         5         6         6

Average Solution Time** for Problems Solved
                                        3.127     3.550     7.874     3.196    20.883    10.068     3.990

*Indicates optimal solution was not found within time limit of 150 seconds
**IBM 370/168 CPU time, in seconds.
***If one solves the version of this problem as reported by Haldi and by Trauth and Woolsey, the optimal
solution is Z = 8. If, however, two of the constraint coefficients are changed (one of them to 1; the subscripts
and the second value are illegible in the source), the solution reported by Haldi and by
Trauth and Woolsey (Z = 9) is correct.


in  solving  the  original  version  of  these  problems, 
this  was  the  only  alternative  investigated  for 
these  larger  problems. 

Again  DZLP  recorded  the  lowest  execution  times 
for  both  problem  sizes,  and  in  general,   those  pro- 
cedures not  using  any  form  of  linear  programming 
began  to  experience  significant  difficulty  in 
solving  these  types  of  problems  when  the  number 
of  variables  in  the  problem  increased  beyond  30. 
It  is  also  interesting  that  the  use  of  a  binary 
expansion  for  the  integer  variables  when  used  with 
a  procedure  employing  linear  programming  to 
fathom  partial  solutions  generally  resulted  in 
as  low  if  not  lower  execution  times  for  this 
group  of  problems  than  did  a  direct  treatment  of 
the integer variables. The differences in recorded
execution times are, of course, a function
both of differences in computer programming
decisions and of differences in the algorithmic
approaches developed.

Summary  Observations 

The  computational  experience  described  in  the  pre- 
vious   section  reveals  the  efficacy  of  certain 
improvements  to  the  basic  Balasian  algorithm  as 
implemented  by  computer  programs  supplied  by  the 
authors  of  these  improvements.     Any  evaluation  of 
these  modifications  is  necessarily  confounded  with 
the  programming  decisions  made  to  implement  each 


improvement,  as  well  as  with  the  programming  deci- 
sions made  regarding  the  coding  of  the  basic 
algorithm.     Nevertheless,  the  data  does  indicate 
that  certain  improvements  do  indeed  accelerate 
convergence  on  certain  categories  of  problems. 
The  data  further  provides  direction  for  the  user 
of  these  techniques  for  solving  certain  zero-one 
and integer programming problems by existing,
non-proprietary computer programs,² and provides a
yardstick  for  assessing  the  relative  worth  of 
proprietary  programs  based  upon  their  cost.  Addi- 
tionally,  the  data  suggests  certain  avenues  which 
might  be  explored  in  developing  future  improve- 
ments to  known  enumeration  approaches,  some  of 
which  have  already  appeared  in  the  open  literature. 

Improvements  to  Existing  Techniques 

With  the  exception  of  perhaps  the  allocation  prob- 
lems,  it  appears  that  any  reduction  in  computation 
time  which  could  be  achieved  through  the  use  of 
integer  arithmetic  is  more  than  offset  by  the  use 
of  real  arrays  used  in  conjunction  with  an  L.P. 
routine  to  accelerate  convergence.     This  is  evi- 
denced by  the  lower  overall  computation  times 
recorded  by  SORTED  RIP30C,  RIP30C  and  DZLP.  It 
would appear, therefore, that any improvement made
in the method of solving the imbedded linear program
would have the effect of significantly reducing
solution times. Geoffrion and Marsten [11]
discuss an improved RIP30C³ which incorporates
Table 4

Solution Times* for Fixed Charge Problems of Haldi

Problem      Number of    Number of              SORTED                                ENUMBER8  ENUMBER8
 Number  0-1 Variables  Constraints    RIP30C    RIP30C     DZIP1      DZLP   HOLCOMB     BALAS       MBR

      1             11            4     0.110     0.103     0.090     0.063     0.067     0.070     0.044
      2             12            4     0.110     0.137     0.207     0.053     0.053     0.167     0.080
      3             14            4     0.150     0.197     0.344     0.100     0.057     0.267     0.114
      4             12            4     0.107     0.100     0.166     0.053     0.053     0.314     0.140
      7             22            4     0.760     0.440     0.894     0.360     0.510     7.560     0.407
      8             23            4     1.173     0.820     1.340     0.340     0.890     8.421     0.527
      9             15            6     0.140     0.130     1.454     0.126     0.060     0.907     0.564
     10             30           10     1.310     1.783     4.537     0.647     4.970    10.360     0.814

Summary Measures

Number of Problems in Which Optimal Solution Was Found and Verified
                                            8         8         8         8         8         8         8

Average Solution Time* for Problems Solved
                                        0.483     0.464     1.129     0.218     0.833     3.508     0.336

*IBM 370/168 CPU time, in seconds


Table 5

Mean Solution Times* for Larger Haldi Fixed Charge Problems

Modification to                               SORTED                                  ENUMBER8
Original Problem                    RIP30C    RIP30C     DZIP1      DZLP   HOLCOMB       MBR

b and F multiplied by 10             1.320     1.374     6.611     0.840     7.445     1.099
b and F multiplied by 100            6.328     2.836    35.018     2.775   10.24**     5.366

*IBM 370/168 CPU time, in seconds
**Optimal solution found and verified in only 5 of 8 problems
(among other changes) a dual linear programming
subroutine with column generation which is purportedly
more efficient than the explicit inverse-
revised simplex method available in the current
version of RIP30C.⁴ Given the current
state-of-the-art in enumeration programming, it seems safe
to  conclude  that  any  improved  program  will  neces- 
sarily incorporate  some  type  of  an  imbedded  L.P. 
routine,  although  the  choice  of  the  actual  L.P. 
routine  is  more  open  to  question.     It  is  worth 
mentioning  in  this  connection  that  the  problem 
which  led  to  the  discovery  that  DZLP  may  on 
occasion  return  a  "no  feasible  solution"  when  the 
first  call  to  the  L.P.  returns  the  optimal  zero- 
one  solution  was  solved  in  substantially  less  time 
by  RIP30C  with  the  Clasen  L.P.  routine  than  by 
DZLP. 

In  general,  solution  times  were  lower  for  the  pro- 
grams which  use  a  starting  solution  determined  by 
solving  the  continuous  version  of  the  zero-one 
problem  by  linear  programming  and  then  rounding 
this solution. Subsequent testing with both RIP30C
and DZLP on the Petersen capital budgeting
problems using a null starting vector and a rounded
L.P. solution revealed the general efficacy of
initializing  the  implicit  enumeration  algorithm 
with  other  than  a  vector  of  zeroes.     This  would 
tend  to  substantiate  Salkin's  [23]  contention 
that  a  good  initial  solution  can  significantly 
reduce  the  length  of  the  search  and  would  further 
suggest  that  other  initializing  schemes  be 
examined.     Byrne  and  Prall  [3]  have  reported  suc- 
cess in  this  area,  and  the  use  of  a  heuristic 
starting  procedure  such  as  the  "effective  gra- 
dient" procedure  of  Senju  and  Toyoda  [25]  also 
investigated  by  Wyman  [31]  shows  promise  both  for 
initializing  the  implicit  enumeration  algorithm 
and  for  providing  a  good  initial  bound  on  the 
objective  function  value  for  capital  budgeting 
type  problems. 
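
The idea of starting the enumeration from a rounded L.P. solution rather than from a vector of zeroes can be sketched as below. This is only an illustration under stated assumptions, not the rule used by RIP30C or DZLP: the function name is hypothetical, the constraints are assumed to have the form Ax <= b with nonnegative coefficients and nonnegative b (as in capital budgeting problems), and the fractional solution is assumed to be supplied by some L.P. routine.

```python
def rounded_start(x_lp, A, b):
    """Round a fractional L.P. solution to a feasible 0-1 starting vector."""

    def feasible(x):
        # Every constraint row must satisfy sum_j a_ij * x_j <= b_i.
        return all(
            sum(a * xj for a, xj in zip(row, x)) <= bi
            for row, bi in zip(A, b)
        )

    # First try rounding each component to the nearest of 0 and 1.
    nearest = [1 if xj >= 0.5 else 0 for xj in x_lp]
    if feasible(nearest):
        return nearest

    # Otherwise keep only the variables the L.P. set (numerically) to 1.
    rounded_down = [1 if xj >= 1.0 - 1e-9 else 0 for xj in x_lp]
    if feasible(rounded_down):
        return rounded_down

    # Fall back to the null vector, feasible under the assumptions above.
    return [0] * len(x_lp)
```

Under these assumptions the returned vector also supplies an initial bound on the objective function value, which is the role the text attributes to a good starting solution.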

Although  less  convincing  than  the  general  efficacy 
of  employing  LP-based  routines,   the  data  for  the 
problems  examined  tend  to  indicate  that  the  use  of 
a  binary  expansion  for  the  integer  variables  may 
be  as  effective  as  treating  the  integer  variables 
directly, at least within the context of implicit
enumeration. Naturally, the differences in
results reported are confounded with differences in
programming  decisions  made  to  implement  each  of 
the  algorithms  described. 

Special  Problem  Structures 

Integer  and  binary  programming  problems  with 
"special  structures"  are  currently  being  solved 
in  a  modest  amount  of  computation  time  for  prob- 
lems involving  a  few  hundred  variables.  Thangavelu 
and  Shetty  [26],   for  example,  have  developed  a 
special  purpose  enumeration  procedure  for  solving 
a  binary  formulation  of  the  assembly  line  balancing 
problem,  and  quote  impressive  results.     Their  pro- 
cedure has  been  generalized  to  the  project  and  job- 
shop  sequencing  problem  by  Patterson  and  Roth  [20], 
and  again  good  results  have  been  reported.  Thus, 
it  is  possible  that  special  purpose  procedures 
will,   in  the  future,   replace  general  purpose 
algorithms  on  special  types  of  problems  as  more 
knowledge is gained in solving these special
structure  problems. 


Predicting  the  Time  Required  for  Problem  Solution 

One of the more important formulations (by application)
of binary programming is the capital
budgeting problem. Although these types of problems
do possess a fairly predictable structure
(dense matrices, positive $a_{ij}$'s, etc.), it is not
a structure which is easily exploited. Interest
thus centers on whether solution times can be
predicted as a function of variables other than n,
the number of variables in a problem. The experience
gained in solving the IBM problems of Section
III suggests that constraint severity as measured
by the "amount of slack present in a constraint"
can influence the time required for problem solution.

Nine sets of ten capital budgeting problems were
computer generated. Each set contained 50, 100
or 175 variables and 20 constraints (this would
correspond to a planning horizon of 5 years of
4 quarters each, or 20 years of one year each,
etc.). Coefficients for the problems were generated
randomly such that problems within a set had
similar A matrices, but between sets differed in
the variability in $c_j$ and $b_i$ values. The problem
generator is more fully described in Ference [6].
An attempt was then made to solve the generated
problems with the sorted version of RIP30C.
These results are summarized in Table 6. In
general, those problems possessing the higher
variability in $b_i$ values (for similar A-matrices)
were solved in significantly less computation
time than those with less variability. Subsequent
experiments with capital budgeting problems with
11-75 variables indicated the most significant
variable in predicting solution time over the
range of problem sizes examined was a variance
measure of constraint severity given by


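$$\left(\sum_{i=1}^{m}\left(\left(\sum_{j=1}^{n}\frac{a_{ij}}{b_i}\right)-\bar{X}_{SEV}\right)^{2}\right)\div m ,$$

where

$$\bar{X}_{SEV} = \left(\sum_{i=1}^{m}\left(\sum_{j=1}^{n}\frac{a_{ij}}{b_i}\right)\right)\div m .$$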

The  R-squared  values  in  the  regression  models 
developed  were  on  the  order  of  0.88,  indicating 
it  is  possible  to  identify  classes  of  and  charac- 
teristics of  problems  which  are  likely  amenable 
to  solution  with  implicit  enumeration  techniques. 
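
The mean and variance measures of constraint severity defined above can be computed directly from the A matrix and the b-vector. The following is a minimal sketch; the function name is hypothetical, and it assumes A is given as a list of m rows and that every $b_i$ is nonzero.

```python
def constraint_severity(A, b):
    """Return (mean, variance) of the row severities sum_j a_ij / b_i."""
    # Severity of constraint i: total of its coefficients relative to b_i.
    row_severity = [sum(row) / bi for row, bi in zip(A, b)]
    m = len(row_severity)
    mean = sum(row_severity) / m
    # Variance about the mean, divided by m as in the expression above.
    variance = sum((s - mean) ** 2 for s in row_severity) / m
    return mean, variance
```

For example, with A = [[1, 2, 3], [2, 2, 2]] and b = [6, 3], the row severities are 1.0 and 2.0, giving a mean of 1.5 and a variance of 0.25.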

Conclusion

Pure  binary  and  integer  programming  problems  were 
input  to  five  different  computer  codes,  each  of 
which  incorporates  various  modifications  and 
improvements  to  the  basic  Balasian  algorithm. 
The overall conclusion is that the surrogate constraint
concept developed by Geoffrion and programmed
into RIP30C and the dynamic origin concept
of Salkin and Spielberg are the most effective
of the improvements examined,⁵ although other
improvements were often effective on some problems.
The  effects  of  determining  good  initializing 
solutions  were  discussed,     and  suggestions  for 
incorporating     other     improvements  into  zero- 
one  integer  programming  routines  were  made.  The 
results  reported  tend  to  show  that  certain 
Table 6

Mean Solution Times* Required to Solve Randomly Generated Capital Budgeting Problems

Number of    Variability in    Variability in    Mean Solution Time*,
Variables    c_j values        b_i values        Ten Problems

    50       Low               Low                   37.36
    50       Low               High                   2.146
    50       High              Low                   36.49
    50       High              High                   1.689
   100       Low               Low                  150
   100       Low               High                  22.78
   100       High              Low                  150**
   100       High              High                   9.947
   175       High              High                  38.89

*IBM 370/168 CPU time, in seconds
**Indicates no problems were solved within time limit of 150 seconds per problem.


types  of    pure-integer  programming  problems 
may  be  amenable  to  solution  by  zero-one  integer 
programming. 
the preparation of this paper. The anonymous
referee  provided  several  suggestions  useful  in 
revising  the  manuscript,  as  well  as  the  following 
detailed  comments: 

1.  The  special  types  of  cuts  employed  in  the 
Lemke-Spielberg code (DZIP1), although termed
Gomory  cuts  by  the  authors,  are  not  the  same  as 
those  traditionally  referenced  in  the  literature 
as  Gomory  Cuts,  and  used  in  solving  the  general 
IP  problem  (cutting  planes) . 

2.  The  original  use  of  the  generalized  origin 
technique  is  due  to  Spielberg,  "Plant  Location 
with  Generalized  Search  Origin,"  Management  Science, 
Vol. 16, No. 3 (November 1969), pp. 165-178.

One  final  comment  made  by  the  anonymous  referee 
was  that  the  paper  "probably  is  not  up-to-date  in 
the  sense  that  all  promising  new  approaches  are 
included."    The  intent  of  the  investigation  was  to 
evaluate  various  "depth-first-search"  approaches 
for  solving  the  binary  and  the  bounded  integer 
programming  problems  using  computer  software 
(FORTRAN  programs)  supplied  by  the  originator  of 
the  algorithm  evaluated.     Within  the  restrictions 
of  the  techniques  evaluated  and  the  general 
availability  of  software  for  implementing  these 
techniques, the paper "probably" is fairly up-to-date,
although certain other search strategies may
offer more promise in obtaining solutions rapidly.
Hopefully, the computational experience reported
herein can be used as a yardstick for assessing
the "promising new approaches" for solving
binary and bounded integer programming problems.


Footnotes 

1.  Investigations  of  the  efficacy  of  various 
mathematical  programming  algorithms  for  other  than 
the  binary  programming  problem  have  recently 
appeared  in  the  open  literature.    A  computational 
investigation of the "pure-integer" programming
problem,   for  example,   is  reported  by  Trauth  and 
Woolsey  [27].     Zionts  has  recently  performed 

some  empirical  tests  of  the  criss-cross  method 
of  linear  programming  [32].     And  Braitsch  compares 
four  quadratic  programming  algorithms  in  a  1972 
paper  [2]. 

2.  Each  of  the  programs  examined  herein  is 
available  through  SHARE  or  RAND  Corporation 
at  a  very  modest  (usually  mailing)  cost. 

3.  The  improved  RIP30C  was  not  available  for 
testing. 

4. Geoffrion (in a private communication) reported
solving  capital  budgeting  problems  involving  more 
than  two-hundred  variables  with  the  improved 
RIP30C. 

5.  Each  of  the  computer  codes  examined  requires 
that  certain  input  parameters  be  set  to  influence 
the  direction  of  the  search  and  the  search  strategy 
employed.     Hence  any  conclusions  made  regarding 
the  efficacy  of  the  various  approaches  should  be 
made  in  light  of  the  version  actually  examined. 
Extreme  care  was  exercised  in  determining  the 
strategy to be followed; the authors' recommendations
were  usually  adopted.     However,  it  is  possible 
that  a  given  approach  could  record  lower  solution 
times  on  a  given  class  of  problems  using  a  variant 
of  the  solution  strategy  examined. 
Abstract


This  paper  presents  the  results  of  an 
experiment  to  determine  the  relationship 
between  parameters  which  can  be  used  to  describe 
linear  programs  and  standard  measures  that  can 
be  used  to  describe  the  performance  of  the 
simplex algorithm. The test problems are generated
by LPGENR, a FORTRAN subroutine that generates
linear programs, and solved by MPS/360.
Problems  vary  in  size  from  50  rows  and  100 
columns  to  150  rows,  300  columns  and  50  upper 
bounds  with  varying  densities  and  solution 
characteristics.     Correlations  and  least  squares 
fits  are  calculated  to  determine  the  relation 
between parameters and performance measures.
Several  unusual  relationships  are  reported. 


1.  Introduction 

Several  experimental  studies  in  linear 
programming  have  appeared  in  the  literature 
(see [1], [2], [5], [6]), but much of the experimental
work is unpublished and is transmitted
as  folklore.     For  example,   it  is  often  stated 
that  the  number  of  iterations  is  between  one  and 
three  times  the  number  of  rows.     This  study  was 
initiated  to  examine  the  relationship  between 
several  parameters  including  some  often  quoted 
ones  and  some  performance  measures  of  the  simplex 
algorithm. At the same time the performance of
LPGENR (see [3]) could be examined.


2.  Software 

LPGENR (see [3] and [8]) is a FORTRAN subroutine
system which generates LP problems with
prespecified  characteristics  and  a  known  optimal 
solution  using  a  pseudo-random  number  generator. 
Options    allow   the  parameters  to  be  passed  or 
generated  with  a  specified  range.  Further, 
output  may  be  obtained  in  several  forms  includ- 
ing MPS  card  image  format.     A  brief  description 
of  the  algorithm  for  generating  the  problems  is 
given  in  appendix  one. 

The  generated  problems  were  solved  using 
IBM's  MPS/360  with  no  default  changes  on  the 
LSU  IBM  360/65  under  the  OS/MVT  operating  system. 
This  later  proved  to  be  a  problem  since  the 
XCLOCKSW  switch  was  on.     This  switch  causes  an 


INVERT  demand  to  be  controlled  by  the  wall  clock. 
Since the operating system is multiprogrammed,
reinversion  was  performed  more  frequently  than 
desired,  sometimes  only  three  iterations  apart 
and  in  a  manner  not  completely  predictable.  About 
16  problems  were  rerun  and  are  presented  in  the 
appendix.

The  statistical  analysis  was  performed  by 
the  Statistical  Analysis  System  (SAS)   (see  [7]). 


3.  Experiment 

The  parameters  chosen  for  study  were:  the 
number  of  rows  varying  from  50  to  150,   the  number 
of  columns  varying  from  100  to  300,   the  matrix 
density  varying  from  about  .40  to  .04,   the  number 
of  upper  bounds  varying  from  0  to  half  the  number 
of  rows.     Further,  various  problem  types  were 
included  that  defined  the  characteristics  of  the 
optimal  solution. 

The  optimal  solution  characteristics  for 
this  study  can  be  divided  into  three  classes: 
the  type  of  optimal  solution,   the  size  of  the 
reduced  costs  at  optimality,  and  the  number  of 
variables  at  their  upper  bound.     There  were  three 
optimal  solution  types:     degenerate,  unique  with 
all  basic  variables  greater  than  zero,  and  alter- 
native.    For  degenerate  solutions  the  number  of 
basic  variables  at  zero  was  set  to  be  twenty-five 
percent  of  the  number  of  rows.     For  alternative 
optimal  solutions  the  number  of  nonbasic  columns 
with  reduced  costs  of  zero  was  set  to  be  ten 
percent of the number of rows. The magnitude of
the reduced costs of nonbasic columns at optimality
is specified by multiplying the objective
coefficient  which  would  produce  a  zero  reduced 
cost  by  the  factor,   1-r,  where  r  is  a  uniform 
random  variable  over  the  interval  0  to  either  0.1, 
0.5  or  1.0.     The  number  of  variables  at  bound  in 
the  optimal  solution  was  set  to  be  half  the 
number  of  upper  bounds. 
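
The reduced cost perturbation described above can be sketched as follows. This is an illustration only, not LPGENR's code: the function and argument names are hypothetical, and it assumes the reduced cost of column j is $c_j - y^T a_j$, so that the objective coefficient producing a zero reduced cost is $y^T a_j$ for the chosen dual solution y.

```python
import random


def perturbed_costs(columns, duals, max_r, seed=0):
    """Objective coefficients for nonbasic columns, scaled by the factor 1 - r."""
    rng = random.Random(seed)
    costs = []
    for col in columns:
        # Coefficient that would give this column a zero reduced cost.
        zero_rc_cost = sum(y * a for y, a in zip(duals, col))
        # r is uniform on [0, max_r]; the study used max_r = 0.1, 0.5 or 1.0.
        r = rng.uniform(0.0, max_r)
        costs.append(zero_rc_cost * (1.0 - r))
    return costs
```

A larger max_r spreads the nonbasic objective coefficients further from the zero-reduced-cost value, which is the sense in which the study varies the size of the reduced costs at optimality.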

All  other  input  parameters  for  LPGENR  were 
held  constant.     Each  problem  was  a  maximization 
with  all  equality  rows.     The  nonzero  entries  in 
the  constraint  matrix  were  uniformly  distributed 
between  -2  and  10;   the  optimal  nonzero  primal  and 
dual  solution  values  from  which  the  cost  row  and 
righthand  side  are  generated  were  uniformly 
distributed  between  0  and  20;  all  values  were 
rounded  to  the  nearest  integer. 
The performance measures recorded for each
test  problem  were  the  number  of  phase  one  itera- 
tions,  the  CPU  time  of  the  execution  step  and 
the  percent  by  which  the  generated  optimal  value 
deviated  from  the  calculated  optimal  value.  The 
CPU  time  includes  a  modest  amount  of  time  for 
setting  up  the  problem. 

It  was  hypothesized  that  problems  in  which 
nonbasic  variables  at  optimality  had  large 
reduced  costs  would  solve  faster  since  the 
optimal  columns  might  be  easier  for  the  algorithm 
to choose. Also, problems with alternative
optimal solutions would be easier to solve since
the algorithm need only find one of many optima. It
is obvious that the density has an effect on the
number of computations, but little has been said
concerning its effect on the number of iterations.

4. Results

A total of 239 problems were solved and the
results are listed in appendix two. In 29 cases,
tolerance checks made after reinversion caused
the algorithm to terminate. In all but three
cases, these can be recognized by a nonzero
value in the last column of the table. In five
cases (two of the five are replications), marked
in the table, the solution was in fact
optimal, but in each case the sum of infeasibilities
was zero and several degenerate pivots were
taken before the problem was declared infeasible
after the tolerance check. Some problems that
were rerun after tolerance checks declared them
infeasible continued past the point where they
were previously terminated after reinversion;
subsequently, the optimal solution was found and
declared as such, indicating that in a sense the
problem is self-correcting.

Least squares fits and correlation coefficients
were calculated for the 50 and 100 row
problems that reached the optimal solution. The
problems with 150 constraints were not included
in these calculations since only three reached
optimality and the others encountered numerical
problems. The correlation coefficients are listed
in table one. Least squares fits were calculated
with phase I iterations, phase II iterations,
total iterations and CPU time as dependent
variables. The calculations are presented for
CPU time but not discussed because of the wall
clock reinversion demand. All fits were forced
to have a zero intercept. The results are
presented in tables two, three, four and five.
The values in each row are the least squares
coefficient of the independent variable and
the standard error of the coefficient. The last
three variables are the coefficients for the
degenerate, unique and alternative optimal
solutions, which were created as dummy variables.

When total iterations is the dependent
variable, the three variables contributing most
to the fit are, in order, the number of columns,
the number of rows and the density. The coefficient
of the number of rows is 2.09, which is
consistent with folklore. The number of columns,
the independent variable which contributed most
to the fit, had a coefficient of 1.3, and the
density had a large negative coefficient.

When phase one iterations is the dependent
variable, the three variables contributing most
to the fit are, in order, the number of rows,
the number of columns, and the density. The
number of rows is more important in explaining
phase one iterations since the criterion is simply
to find a feasible solution. Again the density
produces a negative influence.

When phase two iterations is the dependent
variable, the three variables contributing most
to the fit are, in order, the number of columns,
the density and the number of upper bounds.
Surprisingly, the number of rows has a negative
coefficient, but contributes very little to the
fit, indicating that its effect is virtually
negligible. The major part of the explanation
is due to the number of columns and the density,
which again shows up negative.


Although  the  dummy  variables  that  represent 
the  types  of  optimal  solution  are  not  among  the 
variables  contributing  most  to  the  fits,   it  is 
interesting  to  note  some  general  trends.  When 
comparing  the  three  solution  types  it  is  per- 
haps easiest  to  consider  the  differences  between 
the  coefficients.     In  each  fit  using  iterations 
the  difference  between  the  coefficients  for 
degenerate  and  either  unique  or  alternative 
solutions  indicates  that  degenerate  solutions 
require  more  iterations  than  either  unique  or 
alternative  optimal  solutions. 
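
A least squares fit forced through the origin, as used for the fits reported in this section, can be sketched for two independent variables by solving the normal equations without a constant term. This is a minimal illustration rather than the SAS procedure used in the study; the function name is hypothetical, and real fits in the study include more independent variables.

```python
def fit_through_origin(x1, x2, y):
    """Least squares coefficients (b1, b2) for y ~ b1*x1 + b2*x2, zero intercept."""
    # Sums of squares and cross products; no column of ones is included,
    # which is what forces the intercept to zero.
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("independent variables are collinear")
    # Solve the 2-by-2 normal equations by Cramer's rule.
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2
```

With made-up data in which iterations equal exactly twice the rows plus the columns (rows 50, 50, 100, 100; columns 100, 200, 200, 300), the fit recovers coefficients of 2 and 1.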

5.     Discussion  of  the  Results 

The  most  surprising  results  concern  the 
reduced  cost  perturbation  and  the  density.  The 
different reduced cost perturbations produced virtually
no effect on any of the performance measures
in  any  of  the  statistical  calculations.  Another 
surprising  result  is  the  negligible  effect  of 
the  number  of  rows  on  phase  two  iterations. 

Although the density had a positive
influence on CPU time, it had a negative
effect on the iteration measures. That is,
lower density problems required more iterations
than problems with higher density. A problem
with lower density could result in more degenerate
pivots. The increased iterations when
problems have degenerate optimal solutions could
also be the result of a greater number of degenerate
pivots on the way to the optimum. Both
of these speculations support the folklore that
degenerate problems often require more iterations.

To our knowledge this is the first study
that has attempted to consider the joint effect
of a number of parameters. A question not
answered by this study is how generated problems
compare to real world problems. One conclusion
that can be drawn is that the generator can be
specified to create problems that confuse the
algorithm; for example, some problems were
declared infeasible when they were in fact optimal.
Correlation Coefficients

                             Phase I      Phase II     Total        CPU
                             iterations   iterations   iterations   time

number of rows                  .49          -.31          .13         .42
number of columns               .34           .85          .66         .39
number of upper bounds          .13           .04          .10         .12
density                        -.38          -.20         -.33         .01
reduced cost perturbation       .02          -.01          .00         .00

table one


Dependent variable: phase one iterations
R^2 = .93

independent variable          coeff.     std. err.

number of rows                  2.31        0.16
number of columns               0.55        0.04
number of upper bounds         -0.14        0.20
density                        -190         31.5
reduced cost perturbation       2.13        8.82
degenerate optimum             -24.8        18.8
unique optimum                 -50.7        18.4
alternative optimum            -43.1        18.7

table two


Dependent variable: phase two iterations
R^2 = .91

independent variable          coeff.     std. err.

number of rows                 -0.22        0.12
number of columns               0.74        0.03
number of upper bounds          0.32        0.14
density                        -106         22.8
reduced cost perturbation      -2.90        6.37
degenerate optimum              2.74        13.6
unique optimum                 -20.9        13.3
alternative optimum            -20.5        13.5

table three


Dependent variable: total iterations
R^2 = .93

independent variable          coeff.     std. err.

number of rows                  2.09        0.26
number of columns               1.30        0.07
number of upper bounds          0.18        0.31
density                        -296         49.8
reduced cost perturbation      -0.76        13.9
degenerate optimum             -22.1        29.8
unique optimum                 -71.6        29.2
alternative optimum            -63.6        29.6

table four


Dependent variable: CPU time
R^2 = .84

independent variable          coeff.     std. err.

number of rows                  2.27        0.18
number of columns               0.60        0.05
number of upper bounds          0.005       0.21
density                         106         34.2
reduced cost perturbation      -1.89        9.57
degenerate optimum             -160         20.4
unique optimum                 -177         20.0
alternative optimum            -159         20.2

table five
Appendix One

LPGENR Algorithm for
Generating Linear Programs


Given the following linear program

      cx

      Ax = b
      0 <= x <= d

      A is m by n

LPGENR generates c, A, b and d as follows:

Step 1:  Generate x* >= 0, an optimal solution,
         and (u*, v*) >= 0, an optimal set of
         multipliers. A degenerate optimal
         solution will have more than n-m
         variables in x* at 0; a unique optimal
         solution will have exactly m variables
         greater than zero; an optimal solution
         with alternative optima will have more
         than m variables greater than zero.

Step 2:  for j = 1 to n

           if v_j* > 0, d_j = x_j*
           if v_j* = 0, d_j = x_j* (1 + URV[0, .5])

Step 3:  for i = 1 to m and j = 1 to n

           if URV[0, 1] < density, a_ij = URV[-2, 10]
           otherwise, a_ij = 0

Step 4:  for i = 1 to m

           b_i = A_i x*

Step 5:  for j = 1 to n

           if x_j* > 0, c_j = u* A_j + v_j*
           if x_j* = 0, c_j = (u* A_j + v_j*)(1 ± URV[0, t])

         where ± is determined by the sign of
         u* A_j + v_j* and t is the reduced cost
         perturbation.

Note: URV[e, f] is a uniform pseudo random number
between e and f.
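
The five steps can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not in the appendix: the program is taken to be a maximization (the listing does not show the sense), step 1 uses one simple recipe for a unique optimum (first m variables positive, the rest zero), only the first `n_bounds` variables carry a finite upper bound, and the names `lpgenr` and `is_optimal` are hypothetical.

```python
import random


def lpgenr(m, n, density, t, n_bounds=0, seed=0):
    """Sketch of steps 1-5 for  max cx,  Ax = b,  0 <= x <= d  (sense assumed)."""
    assert n_bounds <= m <= n
    rng = random.Random(seed)
    urv = rng.uniform  # URV[e, f]

    # Step 1 (assumed recipe): the first m variables are positive, the rest
    # are zero. Every other bounded variable gets v_j* > 0 (tight bound).
    x = [urv(1.0, 10.0) if j < m else 0.0 for j in range(n)]
    u = [urv(0.0, 1.0) for _ in range(m)]
    v = [urv(0.1, 1.0) if (j < n_bounds and j % 2 == 0) else 0.0 for j in range(n)]

    # Step 2: d_j = x_j* when v_j* > 0, otherwise x_j* (1 + URV[0, .5]).
    d = []
    for j in range(n):
        if j >= n_bounds:
            d.append(float("inf"))  # variable without an upper bound (assumption)
        elif v[j] > 0:
            d.append(x[j])
        else:
            d.append(x[j] * (1.0 + urv(0.0, 0.5)))

    # Step 3: a_ij = URV[-2, 10] with probability `density`, otherwise 0.
    A = [[urv(-2.0, 10.0) if urv(0.0, 1.0) < density else 0.0 for _ in range(n)]
         for _ in range(m)]

    # Step 4: b_i = A_i x*.
    b = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]

    # Step 5: c_j = u* A_j + v_j* if x_j* > 0; otherwise perturb it so that
    # c_j <= u* A_j + v_j*, the sign choice that keeps x* optimal for a max.
    c = []
    for j in range(n):
        s = sum(u[i] * A[i][j] for i in range(m)) + v[j]
        if x[j] > 0:
            c.append(s)
        else:
            pert = urv(0.0, t)
            c.append(s * (1.0 - pert) if s >= 0 else s * (1.0 + pert))
    return c, A, b, d, x, u, v


def is_optimal(c, A, b, d, x, u, v, tol=1e-7):
    """Check primal feasibility, dual feasibility and a zero duality gap."""
    m, n = len(A), len(c)
    primal = (all(abs(sum(A[i][j] * x[j] for j in range(n)) - b[i]) <= tol
                  for i in range(m))
              and all(0.0 <= x[j] <= d[j] for j in range(n)))
    dual = all(c[j] <= sum(u[i] * A[i][j] for i in range(m)) + v[j] + tol
               for j in range(n))
    primal_obj = sum(c[j] * x[j] for j in range(n))
    dual_obj = (sum(u[i] * b[i] for i in range(m))
                + sum(v[j] * d[j] for j in range(n) if v[j] > 0))
    return primal and dual and abs(primal_obj - dual_obj) <= tol * (1.0 + abs(primal_obj))


problem = lpgenr(m=5, n=12, density=0.33, t=0.5, n_bounds=4, seed=1)
print(is_optimal(*problem))
```

Because c is built from the multipliers, x* and (u*, v*) satisfy the duality conditions by construction, which is what `is_optimal` confirms. A degenerate or alternative-optimum pattern would need a different step 1 recipe than the one assumed here.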
Appendix Two


NUMBER OF ROWS      NUMBER OF COLUMNS      NUMBER OF UPPER BOUNDS
      --                   100                       --

          REDUCED    OPT.          ITERATIONS               CPU       PERCENT
             COST   SOLN.                                  TIME     DEVIATION
DENSITY  PERTURB.    TYPE   PHASE 1  PHASE 2    TOTAL     (SEC)  FROM OPTIMUM

   0.08       0.1       1       107       54      161      27.9           0.0
   0.08       0.5       1       107       54      161      28.4           0.0
   0.08       1.0       1       102       38      140      25.9           0.0
   0.08       0.1       0       143       75      218      36.8           0.0
   0.08       0.5       0       143       68      211      36.8           0.0
   0.08       1.0       0       143       78      221      38.3           0.0
   0.08       0.1       2       100       24      124      24.9           0.0
   0.08       0.5       2       100       24      124      23.6           0.0
   0.08       1.0       2       100       25      125      25.3           0.0
   0.17       0.1       1       112       40      152      40.8           0.0
   0.17       0.5       1       112       40      152      39.7           0.0
   0.17       1.0       1       112       40      152      39.1           0.0
   0.17       0.1       0       149       55      204      50.2           0.0
   0.17       0.5       0       149       55      204      48.6           0.0
   0.17       1.0       0       149       63      212      49.3           0.0
   0.17       0.1       2       113       57      170      43.5           0.0
   0.17       0.5       2       113       57      170      45.1           0.0
   0.17       1.0       2       113       57      170      44.1           0.0
   0.33       0.1       1        95       50      145      56.6           0.0
   0.33       0.5       1        95       50      145      55.5           0.0
   0.33       1.0       1        95       50      145      53.1           0.0
   0.33       0.1       0       120       36      156      56.0           0.0
   0.33       0.5       0       120       36      156      58.5           0.0
   0.33       1.0       0       120       35      155      55.5           0.0
   0.33       0.1       2       111       28      139      49.3           0.0
   0.33       0.5       2       111       28      139      49.1           0.0
   0.33       1.0       2       111       28      139      50.6           0.0


NUMBER OF ROWS      NUMBER OF COLUMNS      NUMBER OF UPPER BOUNDS
      50                   100                       25

          REDUCED    OPT.          ITERATIONS               CPU       PERCENT
             COST   SOLN.                                  TIME     DEVIATION
DENSITY  PERTURB.    TYPE   PHASE 1  PHASE 2    TOTAL     (SEC)  FROM OPTIMUM

   0.08       0.1       1       126       27      153      25.5           0.0
   0.08       0.5       1       126       36      162      27.0           0.0
   0.08       1.0       1       126       32      158      26.1           0.0
   0.08       0.1       0       103       34      137      26.1           0.0
   0.08       0.5       0       103       20      123      23.4           0.0
   0.08       1.0       0       103       24      127      24.9           0.0
   0.08       0.1       2       113       33      146      28.6           0.0
   0.08       0.5       2       113       32      145      27.6           0.0
   0.08       1.0       2       113       32      145      30.4           0.0
   0.17       0.1       1       107       43      150      43.0           0.0
   0.17       0.5       1       107       47      154      43.6           0.0
   0.17       1.0       1       107       44      151      43.5           0.0
   0.17       0.1       0       135       72      207      59.5           0.0
   0.17       0.5       0       135       72      207      57.4           0.0
   0.17       1.0       0       135       80      215      52.4           0.0
   0.17       0.1       2       125       23      148      49.3           0.0
   0.17       0.5       2       125       32      157      49.9           0.0
   0.17       1.0       2       125       30      155      44.6           0.0
   0.33       0.1       1       111       38      149      55.2           0.0
   0.33       0.5       1       111       33      144      58.7           0.0
   0.33       1.0       1       111       32      143      64.2           0.0
   0.33       0.1       0       116       51      167      60.6           0.0
   0.33       0.5       0       116       48      164      61.1           0.0
   0.33       1.0       0       116       53      169      61.6           0.0
   0.33       0.1       2       117       56      173      67.6           0.0
   0.33       0.5       2       117       66      185      63.2           0.0
   0.33       1.0       2       117       67      184      70.4           0.0


NUMBER OF ROWS      NUMBER OF COLUMNS      NUMBER OF UPPER BOUNDS
      50                   200                        0

          REDUCED    OPT.          ITERATIONS               CPU       PERCENT
             COST   SOLN.                                  TIME     DEVIATION
DENSITY  PERTURB.    TYPE   PHASE 1  PHASE 2    TOTAL     (SEC)  FROM OPTIMUM

   0.08       0.1       1       182      120      302      74.2           0.0
   0.08       0.5       1       182      120      302      69.9           0.0
   0.08       1.0       1       182      120      302      74.0           0.0
   0.17       0.1       1       164       77      241      80.4           0.0
   0.17       0.5       1       164       72      236      79.2           0.0
   0.17       1.0       1       164       77      241      80.6           0.0
   0.33       0.1       1       127      105      232     119.1           0.0
   0.33       0.5       1       127      115      242     119.2           0.0
   0.33       1.0       1       127      105      232     113.8           0.0
   0.08       0.1       0       150      144      294      61.7           0.0
   0.08       0.5       0       150      145      295      63.3           0.0
   0.08       1.0       0       150      117      267      59.0           0.0
   0.17       0.1       0       190      102      292      92.7           0.0
   0.17       0.5       0       190      100      290      93.6           0.0
   0.17       1.0       0       190      102      292      92.6           0.0
   0.34       0.1       0       111      111      222     108.7           0.0
   0.34       0.5       0       111      105      216     106.1           0.0
   0.34       1.0       0       111       96      209     107.0           0.0
   0.08       0.1       2       170       88      258      54.8           0.0
   0.08       0.5       2       162      105      267      57.3           0.0
   0.08       1.0       2       170       88      258      55.8           0.0
   0.17       0.1       2       140      106      246      85.7           0.0
   0.17       0.5       2       140      110      250      84.6           0.0
   0.17       1.0       2       140      109      249      92.2           0.0
   0.33       0.1       2       110       99      209     115.6           0.0
   0.33       0.5       2       110       99      209     116.3           0.0
   0.33       1.0       2       110       99      209     107.2           0.0

NUMBER OF ROWS      NUMBER OF COLUMNS      NUMBER OF UPPER BOUNDS
      50                   200                       25

          REDUCED    OPT.          ITERATIONS               CPU       PERCENT
             COST   SOLN.                                  TIME     DEVIATION
DENSITY  PERTURB.    TYPE   PHASE 1  PHASE 2    TOTAL     (SEC)  FROM OPTIMUM

   0.08       0.1      --       110      104      214      47.4           0.0
   0.08       0.5      --       110      125      235      54.7           0.0
   0.08       1.0      --       110      145      255      57.6           0.0
   0.17       0.1      --       219       92      311      99.8           0.0
   0.17       0.5      --       219      104      323     104.6           0.0
   0.17       1.0      --       219      107      326      99.6           0.0
   0.33       0.1      --       151      140      291     138.7           0.0
   0.33       0.5      --       151      144      295     144.9           0.0
   0.33       1.0      --       151      118      269     136.2           0.0
   0.08       0.1       0       164      172      336      64.2           0.0
   0.08       0.5       0       164      202      366      67.1           0.0
   0.08       1.0       0       165      150      315      65.3           0.0
   0.17       0.1       0       159      107      266      89.4           0.0
   0.17       0.5       0       159      118      277      91.4           0.0
   0.17       1.0       0       159      129      268      85.1           0.0
   0.34       0.1       0       142      146      288     140.8           0.0
   0.34       0.5       0       142      132      274     138.0           0.0
   0.34       1.0       0       142      144      286     134.0           0.0
   0.08       0.1       2       158      122      280      59.6           0.0
   0.08       0.5       2       158      119      277      57.4           0.0
   0.08       1.0       2       158      123      281      59.3           0.0
   0.17       0.1       2       139       67      226      76.4           0.0
   0.17       0.5       2       139       90      229      71.9           0.0
   0.17       1.0       2       139      144       --      78.5           0.0
   0.33       0.1       2       148      124      272     132.3           0.0
   0.33       0.5       2       148      104      252     126.2           0.0
   0.33       1.0       2       148      104      252     124.1           0.0
NUMBER OF ROWS      NUMBER OF COLUMNS      NUMBER OF UPPER BOUNDS
      50                   300                        0

          REDUCED    OPT.          ITERATIONS               CPU       PERCENT
             COST   SOLN.                                  TIME     DEVIATION
DENSITY  PERTURB.    TYPE   PHASE 1  PHASE 2    TOTAL     (SEC)  FROM OPTIMUM

   0.08       0.1       1       156      161      317      62.2           0.0
   0.08       0.5       1       156      142      298      77.5           0.0
   0.08       1.0       1       156      139      295      76.1           0.0
   0.17       0.1       1       179      130      309     116.4           0.0
   0.17       0.5       1       179      130      309     115.4           0.0
   0.17       1.0       1       179      130      309     118.7           0.0
   0.33       0.1       1       176      134      310     161.4           0.0
   0.33       0.5       1       176      118      294     151.6           0.0
   0.33       1.0       1       176      120      296     144.0           0.0
   0.08       0.1       0       239      219      458     100.6           0.0
   0.08       0.5       0       240      220      460      98.3           0.0
   0.08       1.0       0       244      243      487     113.8           0.0
   0.17       0.1       0       237      209      446     152.3           0.0
   0.17       0.5       0       237      257      494     171.2           0.0
   0.17       1.0       0       237      206      443     154.6           0.0
   0.33       0.1       0       169      161      330     176.1           0.0
   0.33       0.5       0       169      139      308     162.9           0.0
   0.33       1.0       0       169      133      302     164.9           0.0
   0.08       0.1       2       181      165      346      87.9           0.0
   0.08       0.5       2       181      169      350      86.5           0.0
   0.08       1.0       2       181      200      361      91.7           0.0
   0.17       0.1       2       179      132      311     113.6           0.0
   0.17       0.5       2       179      113      292     105.6           0.0
   0.17       1.0       2       179      113      292     103.7           0.0
   0.33       0.1       2       113      144      257     146.6           0.0
   0.33       0.5       2       113      144      257     140.9           0.0
   0.33       1.0       2       113      136      251     140.0           0.0

NUMBER OF ROWS      NUMBER OF COLUMNS      NUMBER OF UPPER BOUNDS
      50                   300                       25

          REDUCED    OPT.          ITERATIONS               CPU       PERCENT
             COST   SOLN.                                  TIME     DEVIATION
DENSITY  PERTURB.    TYPE   PHASE 1  PHASE 2    TOTAL     (SEC)  FROM OPTIMUM

   0.08       0.1      --       193      205      398     101.9           0.0
   0.08       0.5      --       193      225      418     101.6           0.0
   0.08       1.0      --       193      179      372      91.6           0.0
   0.17       0.1      --       159      185      344     133.1           0.0
   0.17       0.5      --       159      175      334     126.7           0.0
   0.17       1.0      --       159      179      338     131.8           0.0
   0.33       0.1      --       162      153      315     166.7           0.0
   0.33       0.5      --       162      156      318     170.9           0.0
   0.33       1.0      --       162      143      305     162.2           0.0
   0.08       0.1       0       212      225      437     104.4           0.0
   0.08       0.5       0       212      166      378      66.6           0.0
   0.08       1.0       0       212      157      369      97.2           0.0
   0.17       0.1       0       201      193      394     137.1           0.0
   0.17       0.5       0       201      158      359     117.8           0.0
   0.17       1.0       0       201      160      361     125.7           0.0
   0.34       0.1       0       157      167      324     169.2           0.0
   0.34       0.5       0       157      156      313     173.5           0.0
   0.34       1.0       0       157      167      324     177.1           0.0
   0.08       0.1       2       201      151      352      83.4           0.0
   0.08       0.5       2       201      150      351      81.8           0.0
   0.08       1.0       2       201      161      362      87.1           0.0
   0.17       0.1       2       194      115      309     116.9           0.0
   0.17       0.5       2       194      133      327     117.4           0.0
   0.17       1.0       2       194      127      321     109.0           0.0
   0.33       0.1       2       120      157      277     151.4           0.0
   0.33       0.5       2       120      142      262     151.3           0.0
   0.33       1.0       2       120      144      264     153.0           0.0
NUMBER OF ROWS      NUMBER OF COLUMNS      NUMBER OF UPPER BOUNDS
     100                   100                        0

          REDUCED    OPT.          ITERATIONS               CPU       PERCENT
             COST   SOLN.                                  TIME     DEVIATION
DENSITY  PERTURB.    TYPE   PHASE 1  PHASE 2    TOTAL     (SEC)  FROM OPTIMUM

   0.08       0.1       1       165        0      165      5?.3           0.0
   0.08       0.5       1       165        0      165      65.1           0.0
   0.08       1.0       1       165        0      165      64.6           0.0
   0.17       0.1       1       180        0      180      93.3           0.0
   0.17       0.5       1       180        0      180      94.9           0.0
   0.17       1.0       1       180        0      160      93.5           0.0
   0.33       0.1       1       187        0      167     112.7           0.0
   0.33       0.5      --       187        0      167     143.5           0.0
   0.33       1.0      --       187        0      167     148.1           0.0
   0.08       0.1       0       165        8      173      59.6           0.0
   0.08       0.5       0       166       13      179      ?1.9           0.0
   0.08       1.0       0       165        7      172      59.6           0.0
   0.17       0.1       0       186        6      192     103.5           0.0
   0.17       0.5       0       166        7      193     120.1           0.0
   0.17       1.0       0       186        7      193     105.4           0.0
   0.33       0.1       0       176        0      176     139.9           0.0  -K
   0.33       0.5       0       175        0      175     129.5           0.0  yi
   0.33       0.5       0       176        0      176     140.4           0.0  -V-
   0.33       1.0       0       175        0      175     128.6           0.0  *
   0.33       1.0       0       176        0      176     136.2           0.0  * -♦-
   0.08       0.1       2       161        5      166      55.5          1.80
   0.08       0.1       2       164        0      164      59.7           0.0
   0.08       0.5       2       164        0      164      60.8           0.0
   0.08       1.0       2       164        0      164      59.4           0.0
   0.17       0.1       2       179        0      179     112.6           0.0
   0.17       0.5       2       179        0      179      92.8           0.0
   0.17       1.0       2       179        0      179     117.1           0.0
   0.33       0.1       2       163        0      163     142.9           0.0
   0.33       0.5       2       163        0      163     146.8           0.0
   0.33       1.0       2       163        0      163     146.8           0.0

NUMBER OF ROWS      NUMBER OF COLUMNS      NUMBER OF UPPER BOUNDS
     100                   100                       50

          REDUCED    OPT.          ITERATIONS               CPU       PERCENT
             COST   SOLN.                                  TIME     DEVIATION
DENSITY  PERTURB.    TYPE   PHASE 1  PHASE 2    TOTAL     (SEC)  FROM OPTIMUM

   0.08       0.1      --       147        8      155      60.3           0.0
   0.08       0.5      --       147        9      156      58.7           0.0
   0.08       1.0      --       147       13      160      65.7           0.0
   0.17       0.1      --       176       21      197     114.0           0.0
   0.17       0.5      --       177        7      164      96.9           0.0
   0.17       1.0      --       177       --      182     100.8           0.0
   0.33       0.1      --       176        7      183     136.2           0.0
Abstract

Recently, it has been shown that a class of
penalty function algorithms can readily be adapted
to generate sensitivity analysis information
for a large class of parametric nonlinear programming
problems. In particular, estimates of
the partial derivatives (with respect to the problem
parameters) of the components of a solution
vector and the optimal value function have been
successfully calculated for a number of nontrivial
examples. The approach has been implemented using
the well-known Sequential Unconstrained Minimization
Technique (SUMT) computer program. This
paper briefly summarizes these results, presents
additions to the computer program that include
a screening device for eliminating calculations
associated with less important parameters, and
illustrates the kind of information that can be
generated by applying the technique to a well-known
inventory model.


1. Introduction

Initial numerical results from the
implementation of a penalty function technique
for obtaining sensitivity information in parametric
nonlinear programming were given by Armacost
and Fiacco (1974). The work is based on the
theory developed by Fiacco and McCormick (1968)
and extended by Fiacco (1973). This paper reports
on refinements and extensions of the computational
procedures implemented by Armacost and Mylander
(1973) and Armacost (1976), using the SUMT-Version 4
computer code with the logarithmic-quadratic
loss penalty function to estimate the
partial derivatives of the solution point and the
objective function optimal value, the derivatives
here being taken with respect to the specified
problem parameters.

Fiacco (1973) developed the necessary general
formulas for the partial derivatives of the "optimal
value function," the components of a local
solution point and its associated optimal Lagrange
multipliers, for a large class of parametric nonlinear
programming problems composed of twice differentiable
functions. He also obtained approximation
formulas in terms of the well-known
logarithmic-quadratic penalty function. Recently,
Armacost and Fiacco (1975) particularized and
simplified these formulas for various problem
structures and developed formulas for the first
and second derivatives of the optimal value function
of the given problem. Additionally, Armacost
and Fiacco (1976) have applied the general theory
to easily prove the well-known result that, when
the parameters are the right-hand side components
of the constraints, the optimal Lagrange multipliers
give the gradient of the optimal value
function (with respect to the parameters). Further,
it was shown that the first derivatives of the
Lagrange multipliers give the components of the
Hessian of the optimal value function, and explicit
formulas were developed for the Hessian in
terms of the problem functions.
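
The right-hand side result can be checked by hand on a one-parameter toy problem. The problem below is chosen here purely for illustration and does not come from the cited reports; it assumes a single inequality constraint written as $g(x) \ge \epsilon$ and the Lagrangian $L = f - u\,(g - \epsilon)$.

```latex
\[
  \min_{x}\; x_1^2 + x_2^2
  \quad\text{subject to}\quad x_1 + x_2 - 1 \ge \epsilon .
\]
% The constraint is binding, and by symmetry
\[
  x_1(\epsilon) = x_2(\epsilon) = \tfrac{1}{2}(1+\epsilon),
  \qquad
  f^*(\epsilon) = \tfrac{1}{2}(1+\epsilon)^2,
  \qquad
  \frac{df^*}{d\epsilon} = 1 + \epsilon .
\]
% Stationarity of L = x_1^2 + x_2^2 - u (x_1 + x_2 - 1 - \epsilon) gives 2 x_i = u, so
\[
  u(\epsilon) = 1 + \epsilon = \frac{df^*}{d\epsilon},
  \qquad
  \frac{du}{d\epsilon} = 1 = \frac{d^2 f^*}{d\epsilon^2} .
\]
```

In this instance the multiplier equals the first derivative of the optimal value function and the derivative of the multiplier equals its second derivative, as the general statements assert.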

In their first report on computational experience,
Armacost and Fiacco (1974) concentrated
primarily on presenting computational experience
associated with the calculation of the first derivatives
of a local solution point. The practical
implementability of the approach was demonstrated.

In a subsequent paper, Armacost (1976) reported
on additional computational experience,
focusing on the calculation of the derivatives of
the optimal value function and the Lagrange multipliers,
also implementing a potentially valuable
refinement that allows for computerized screening
for "key" parameters. This paper may be regarded
as a continuation and amplification of the Armacost
paper.

For problems involving a large number of
parameters, a very large number of partial derivatives
may be calculated if one proceeds indiscriminately.
This is not only time-consuming,
but may also be quite burdensome to a user who
must evaluate the overall significance of the
results. One measure of the latter is the effect
of a perturbation on the solution value. It is
quite possible and often observed in practice
that the optimal objective function value is much
more sensitive to a few of the many parameters
present. With this in mind, the method developed
by Armacost and Fiacco (1975) to estimate the
first order sensitivity of the optimal value function
was incorporated in the computer program to
provide an option for preliminary screening of the
parameters to eliminate further calculations involving
perturbations of parameters having "little"
effect on the optimal value function. (A user
can easily introduce his own criteria of significance
in this determination.) Using the formulas
developed by Fiacco (1973), a second option is
included which permits the calculation of the
sensitivity estimates for the Lagrange multipliers.
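
A screening step of this kind might look like the following sketch. The relative-magnitude rule, the threshold `rel_tol`, the function name `screen_parameters` and the sample numbers are all hypothetical; the paper leaves the significance criterion to the user and does not specify one.

```python
def screen_parameters(value_sensitivities, rel_tol=0.05):
    """Return indices of 'key' parameters under a relative-magnitude rule.

    value_sensitivities[j] is an estimate of the partial derivative of the
    optimal value function with respect to parameter j. A parameter is kept
    when its magnitude is at least rel_tol times the largest magnitude.
    """
    largest = max(abs(s) for s in value_sensitivities)
    if largest == 0.0:
        return []  # nothing affects the optimal value to first order
    return [j for j, s in enumerate(value_sensitivities)
            if abs(s) >= rel_tol * largest]


# Made-up first-order sensitivities for four parameters (illustration only).
keep = screen_parameters([12.0, -0.3, 0.0, 4.5])
print(keep)  # parameters 0 and 3 pass the 5 percent rule
```

Only the retained parameters would then be passed on to the more expensive calculation of solution-point and multiplier derivatives.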




In Section 3, a sensitivity analysis is conducted
for a multi-item inventory model developed
by Schrady and Choe (1971) for the U.S. Navy. The
example analyzed is the same small one treated by
Schrady and Choe, though readily extended to a
large-scale model. The results illustrate the potential
value of a detailed automated sensitivity
analysis in practical situations, and hopefully
dramatize the numerous rich interpretations and
insights that can be derived from this information,
as well as indicating the caution that must
be taken in making valid inferences.

The recently obtained basic theoretical results
validating the computational algorithm are
summarized rather completely in the next section
so that the paper might be self-contained.

2. Supporting Theory

The parametric mathematical programming problems
considered here are of the form

$$
\begin{aligned}
\min_{x \in E^n}\quad & f(x,\epsilon) \\
\text{subject to}\quad & g_i(x,\epsilon) \ge 0, \quad i = 1,\ldots,m, \\
& h_j(x,\epsilon) = 0, \quad j = 1,\ldots,p,
\end{aligned}
\qquad P(\epsilon)
$$

where $x$ is the usual vector of variables and $\epsilon$
is a k-component vector of numbers called "parameters."
It is desired ultimately to develop a
complete characterization of a solution $x(\epsilon)$ of
Problem $P(\epsilon)$ as a function of $\epsilon$. In our current
work, we have concentrated on certain measures of
change in a solution, as $\epsilon$ is perturbed from a
specified value, that have recently become
computationally tractable. (Without loss of generality, we
assume that the specified value is $\epsilon = 0$.)

When certain assumptions are satisfied,
Fiacco (1973) and Armacost and Fiacco (1975) have
characterized the "first order sensitivity" of a
"Kuhn-Tucker Triple" and the first and second
order sensitivity of the optimal value function
of Problem $P(\epsilon)$. (These quantities are defined
as the theory is presented.) Additionally, they
have developed formulas for efficiently estimating
this sensitivity when the logarithmic-quadratic
loss penalty function algorithm is used to solve
Problem $P(\epsilon)$. The main theoretical results are
summarized here.

The Lagrangian for Problem $P(\epsilon)$ is defined as

$$
L(x,u,w,\epsilon) = f(x,\epsilon) - \sum_{i=1}^{m} u_i g_i(x,\epsilon) + \sum_{j=1}^{p} w_j h_j(x,\epsilon),
$$

where $u_i$, $i=1,\ldots,m$ and $w_j$, $j=1,\ldots,p$ are
"Lagrange multipliers" associated with the inequality
and equality constraints, respectively. Any
vector $(x,u,w)$ satisfying the usual (first order)
Kuhn-Tucker conditions (Fiacco and McCormick, 1968)
of Problem $P(\epsilon)$ is called a Kuhn-Tucker triple.

The following four assumptions are sufficient
to establish the desired results and are assumed
to hold throughout the paper:

A1: The functions defining Problem $P(\epsilon)$ are
    twice continuously differentiable in
    $(x,\epsilon)$ in a neighborhood of $(x^*,0)$.

A2: The second order sufficient conditions for
    a local minimum of Problem $P(0)$ hold at
    $x^*$ with associated Lagrange multipliers
    $u^*$ and $w^*$.

A3: The gradients $\nabla_x g_i(x^*,0)$ (i.e.,
    $(\partial g_i(x^*,0)/\partial x_1, \ldots, \partial g_i(x^*,0)/\partial x_n)^T$,
    the superscript $T$ denoting transposition)
    for all $i$ such that $g_i(x^*,0) = 0$, and
    $\nabla_x h_j(x^*,0)$, $j=1,\ldots,p$, are linearly
    independent.

A4: Strict complementary slackness holds at
    $(x^*,0)$ (i.e., $u_i^* > 0$ for all $i$ such that
    $g_i(x^*,0) = 0$).

Theorem 1: (Local characterization of a Kuhn-Tucker
Triple (Fiacco, 1973) of Problem $P(\epsilon)$.) If
Assumptions A1, A2, A3 and A4 hold for Problem
$P(\epsilon)$ at $(x^*,0)$, then

(a) $x^*$ is a local isolated minimizing point of
    Problem $P(0)$ and the associated Lagrange
    multipliers $u^*$ and $w^*$ are unique;

(b) for $\epsilon$ in a neighborhood of 0, there
    exists a unique, once continuously differentiable
    vector function
    $y(\epsilon) = (x(\epsilon),u(\epsilon),w(\epsilon))^T$ satisfying the
    second order sufficient conditions for a
    local minimum of Problem $P(\epsilon)$ such that
    $y(0) = (x^*,u^*,w^*)^T = y^*$ and hence, $x(\epsilon)$
    is a locally unique, local minimum of Problem
    $P(\epsilon)$ with associated unique Lagrange
    multipliers $u(\epsilon)$ and $w(\epsilon)$; and

(c) for $\epsilon$ near 0, the set of binding inequalities
    is unchanged, strict complementary
    slackness holds for $u_i(\epsilon)$ for $i$ such
    that $g_i(x(\epsilon),\epsilon) = 0$, and the binding constraint
    gradients are linearly independent
    at $x(\epsilon)$.

This result provides a characterization of a
local solution of Problem $P(\epsilon)$ and its associated
optimal Lagrange multipliers near $\epsilon = 0$. It
generalizes a theorem first presented by Fiacco
and McCormick (1968, Theorem 6) and is closely
related to a generalization of the same theorem
provided independently by Robinson (1974). It
shows that the Kuhn-Tucker triple $y(\epsilon)$ is unique
and well behaved, under the given conditions.
Since $y(\epsilon)$ is once differentiable, the partial
derivatives of the components of $y(\epsilon)$ are well
defined. This fact and Assumption A1 also mean
that the functions defining Problem $P(\epsilon)$ are once
continuously differentiable functions of $\epsilon$ along
the "solution trajectory" $x(\epsilon)$ near $\epsilon = 0$,
and the Lagrangian is a once continuously differentiable
function of $\epsilon$ along the "Kuhn-Tucker
point trajectory."

We are thus motivated to determine a means to
calculate the various partial derivatives, since
this yields a first order estimate of the locally
optimal Kuhn-Tucker triple and the problem functions
near $\epsilon = 0$.

Denote by $\nabla_\epsilon x(\epsilon) \equiv (\partial x_i(\epsilon)/\partial \epsilon_j)$, $i=1,\ldots,n$,
$j=1,\ldots,k$, the $n \times k$ matrix of partial derivatives
of $x(\epsilon)$ with respect to $\epsilon$, and define
$\nabla_\epsilon u(\epsilon)$ and $\nabla_\epsilon w(\epsilon)$ in a similar fashion. We
then define $\nabla_\epsilon y(\epsilon) \equiv (\nabla_\epsilon x(\epsilon), \nabla_\epsilon u(\epsilon), \nabla_\epsilon w(\epsilon))^T$, an
$(n+m+p) \times k$ matrix.

When $y(\epsilon)$ is available, $\nabla_\epsilon y(\epsilon)$ can be calculated
by noting that Conclusion (b) of
Theorem 1 implies the satisfaction of the Kuhn-Tucker
conditions for $P(\epsilon)$ at $y(\epsilon)$ near
$\epsilon = 0$, i.e.,

$$
\begin{aligned}
\nabla_x L[x(\epsilon),u(\epsilon),w(\epsilon),\epsilon] &= 0, \\
u_i(\epsilon)\, g_i[x(\epsilon),\epsilon] &= 0, \quad i=1,\ldots,m, \\
h_j[x(\epsilon),\epsilon] &= 0, \quad j=1,\ldots,p.
\end{aligned}
\tag{1}
$$

Since the Jacobian $M(\epsilon)$ of this system with
respect to $(x,u,w)$ (i.e., the matrix obtained
by differentiating the left side of (1) with respect
to the components of $(x,u,w)$) is nonsingular
under the given assumptions, the total
derivative of the system with respect to $\epsilon$ is
well defined and must equal zero. This yields

$$
M(\epsilon)\, \nabla_\epsilon y(\epsilon) = N(\epsilon),
$$

where $N(\epsilon)$ is the negative of the Jacobian of
the Kuhn-Tucker system with respect to $\epsilon$, and
hence

$$
\nabla_\epsilon y(\epsilon) = M(\epsilon)^{-1} N(\epsilon).
$$
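
The relation between the Kuhn-Tucker Jacobians and the sensitivity matrix can be tried on a very small example. The sketch below assumes a toy problem chosen here for illustration (minimize $x_1^2 + x_2^2$ subject to $x_1 + x_2 - 1 - \epsilon \ge 0$, one parameter, no equality constraints); the helper names `solve` and `kkt_sensitivity` are hypothetical and are not part of the SUMT code.

```python
def solve(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n + 1):
                a[i][j] -= f * a[k][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (a[i][n] - sum(a[i][j] * z[j] for j in range(i + 1, n))) / a[i][i]
    return z


def kkt_sensitivity(x, u):
    """Return d(x1, x2, u)/d(eps) at a Kuhn-Tucker point of the toy problem.

    Kuhn-Tucker system at eps = 0:
        2*x1 - u = 0,   2*x2 - u = 0,   u * (x1 + x2 - 1 - eps) = 0.
    """
    g = x[0] + x[1] - 1.0
    # M: Jacobian of the system with respect to (x1, x2, u).
    M = [[2.0, 0.0, -1.0],
         [0.0, 2.0, -1.0],
         [u, u, g]]
    # N: negative of the Jacobian of the system with respect to eps.
    N = [0.0, 0.0, u]
    return solve(M, N)


# Kuhn-Tucker triple at eps = 0: x* = (0.5, 0.5), u* = 1.
print(kkt_sensitivity([0.5, 0.5], 1.0))
```

For this problem the solution is known in closed form, $x_i(\epsilon) = (1+\epsilon)/2$ and $u(\epsilon) = 1+\epsilon$, so the computed derivatives can be compared with 0.5, 0.5 and 1.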

The class of algorithms based on twice continuously
differentiable penalty functions can be
used without additional assumptions and without
requiring $y(\epsilon)$ to provide an estimate of
$\nabla_\epsilon y(\epsilon)$. Furthermore, most of the information
required to make the estimate is already available
in the typical implementations of these
algorithms. Here, we use the logarithmic-quadratic
penalty function for Problem $P(\epsilon)$
(Fiacco and McCormick, 1968) defined as

$$
W(x,\epsilon,r) = f(x,\epsilon) - r \sum_{i=1}^{m} \ln g_i(x,\epsilon) + (1/2r) \sum_{j=1}^{p} h_j^2(x,\epsilon).
\tag{2}
$$

Under the given assumptions, the following
facts are known for Problem $P(0)$ from penalty
function theory (Fiacco and McCormick, 1968,
Theorems 10 and 17):

(1) For $r > 0$ and small, there exists a
    unique once continuously differentiable
    vector function $x(0,r)$ such that $x(0,r)$
    is a locally unique minimizing point of
    $W(x,0,r)$ in $R^0(0) = \{x : g_i(x,0) > 0,\ i=1,\ldots,m,$
    and $h_j(x,0) = 0,\ j=1,\ldots,p\}$ and such that
    $x(0,r) \to x(0,0) = x^*$;

(2) $\lim_{r \to 0} r \sum_{i=1}^{m} \ln g_i[x(0,r),0] = 0$;

(3) $\lim_{r \to 0} (1/2r) \sum_{j=1}^{p} h_j^2[x(0,r),0] = 0$; and

(4) $\lim_{r \to 0} W[x(0,r),0,r] = f(x^*,0)$.

The following theorem extends these results for
Problem $P(\epsilon)$, where $\epsilon$ is allowed to vary in a
neighborhood of 0, and provides a basis for
approximating the sensitivity information associated
with Problem $P(\epsilon)$. The notation $\nabla_x^2 W$ denotes
the matrix of second partial derivatives of $W$
with respect to $x$.

Theorem 2: (Relationship of solutions of
Problem $P(\epsilon)$ and minima of $W(x,\epsilon,r)$ (Fiacco,
1973).) If Assumptions A1 - A4 hold, then in a
neighborhood about $(\epsilon,r) = (0,0)$ there exists a
unique once continuously differentiable vector
function $y(\epsilon,r) = [x(\epsilon,r),u(\epsilon,r),w(\epsilon,r)]^T$ satisfying

$$
\begin{aligned}
\nabla_x L(x,u,w,\epsilon) &= 0, \\
u_i\, g_i(x,\epsilon) &= r, \quad i=1,\ldots,m, \\
h_j(x,\epsilon) &= w_j r, \quad j=1,\ldots,p,
\end{aligned}
$$

with $y(0,0) = (x^*,u^*,w^*)$ and such that, for any
$(\epsilon,r)$ near $(0,0)$ and $r > 0$, $x(\epsilon,r)$ is a
locally unique unconstrained local minimizing point
of $W(x,\epsilon,r)$, $g_i[x(\epsilon,r),\epsilon] > 0$, $i=1,\ldots,m$, and
$\nabla_x^2 W[x(\epsilon,r),\epsilon,r]$ is positive definite.

Corollary 2.1: (Convergence of estimates using
$W(x,\epsilon,r)$ (Fiacco, 1973).) If Assumptions A1, A2,
A3 and A4 hold for Problem $P(\epsilon)$, then for any $\epsilon$
near 0,

(a) $\lim_{r \to 0+} y(\epsilon,r) = y(\epsilon,0) = y(\epsilon)$, the Kuhn-Tucker
    triple characterized in Theorem 1;
    and

(b) $\lim_{r \to 0+} \nabla_\epsilon y(\epsilon,r) = \nabla_\epsilon y(\epsilon,0) = \nabla_\epsilon y(\epsilon)$.

This  result  motivates  use  of    y^y(£,r)  to 

estimate    V^y(E)   ,  when    e    is  near    0    and  r 

is  near    0  ,  once    y(E,r)     is  available.    Theorem  2 
provides  the  basis  for  an  efficient  calculation  of 
V^y(£,r)   .    Since,  at  a  local  solution  point 

x(E,r)    of   W(x,E,r)   ,  it  follows  that 


V^W[x(E,r) ,E,r]  =  0  , 


(3) 


we can differentiate (3) with respect to $\epsilon$ to obtain

$$\nabla_x^2 W[x(\epsilon,r),\epsilon,r]\, \nabla_\epsilon x(\epsilon,r) + \nabla_\epsilon(\nabla_x W[x(\epsilon,r),\epsilon,r]) = 0. \qquad (4)$$


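By Theorem 2, $\nabla_x^2 W$ is positive definite for $(\epsilon,r)$ near $(0,0)$ and $r > 0$, so $\nabla_x^2 W$ has an inverse and

$$\nabla_\epsilon x(\epsilon,r) = -\nabla_x^2 W[x(\epsilon,r),\epsilon,r]^{-1}\, \nabla_{\epsilon x}^2 W[x(\epsilon,r),\epsilon,r].$$

Also, since

$$u_i(\epsilon,r) = r/g_i(x(\epsilon,r),\epsilon), \quad i=1,\ldots,m, \qquad (5)$$

and

$$w_j(\epsilon,r) = h_j(x(\epsilon,r),\epsilon)/r, \quad j=1,\ldots,p, \qquad (6)$$

for $(\epsilon,r)$ near $(0,0)$ and $r > 0$, these equations can be differentiated with respect to $\epsilon$ to obtain

$$\nabla_\epsilon u_i(\epsilon,r) = -(r/g_i^2)\,[\nabla_x g_i(x(\epsilon,r),\epsilon)\, \nabla_\epsilon x(\epsilon,r) + \partial g_i(x(\epsilon,r),\epsilon)/\partial\epsilon], \qquad (7)$$

$$\nabla_\epsilon w_j(\epsilon,r) = (1/r)\,[\nabla_x h_j(x(\epsilon,r),\epsilon)\, \nabla_\epsilon x(\epsilon,r) + \partial h_j(x(\epsilon,r),\epsilon)/\partial\epsilon]. \qquad (8)$$

Solving (4) and calculating (7) and (8) then yields the components of $\nabla_\epsilon y(\epsilon,r)$, which can be used to estimate $\nabla_\epsilon y(\epsilon)$ for $(\epsilon,r)$ near $(0,0)$.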
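
As an illustration of the calculation in (4), (5) and (7), the following minimal sketch works a two-variable toy problem. The toy problem (minimize $x_1^2 + x_2^2$ subject to $g(x,\epsilon) = x_1 + x_2 - \epsilon \ge 0$), the function name `barrier_sensitivity`, and the closed-form barrier minimizer are assumptions made for this example and are not taken from the paper. For this problem the exact values are $x(\epsilon) = (\epsilon/2, \epsilon/2)$ and $u(\epsilon) = \epsilon$, so each derivative estimate should approach $1/2$ and $1$, respectively, as $r \to 0^+$.

```python
import math


def barrier_sensitivity(eps: float, r: float):
    """Sensitivity estimates for an assumed toy problem (not from the paper):
    minimize x1^2 + x2^2 subject to g(x, eps) = x1 + x2 - eps >= 0, eps > 0.
    """
    # Minimizer of W = x1^2 + x2^2 - r*ln(x1 + x2 - eps). By symmetry
    # x1 = x2 = t, where t is the positive root of 4 t^2 - 2 eps t - r = 0.
    t = (eps + math.sqrt(eps * eps + 4.0 * r)) / 4.0
    g = 2.0 * t - eps
    u = r / g  # multiplier estimate, equation (5)

    # Hessian of W in x, and the derivative of grad_x W with respect to eps.
    a = r / (g * g)
    hess = [[2.0 + a, a], [a, 2.0 + a]]
    mixed = [-a, -a]

    # Equation (4): hess * dx = -mixed, solved by Cramer's rule (2 x 2 system).
    det = hess[0][0] * hess[1][1] - hess[0][1] * hess[1][0]
    rhs = [-mixed[0], -mixed[1]]
    dx = [
        (rhs[0] * hess[1][1] - hess[0][1] * rhs[1]) / det,
        (hess[0][0] * rhs[1] - hess[1][0] * rhs[0]) / det,
    ]

    # Equation (7): du = -(r / g^2) * [grad_x g . dx + dg/d eps].
    grad_g = [1.0, 1.0]
    dg_deps = -1.0
    du = -a * (grad_g[0] * dx[0] + grad_g[1] * dx[1] + dg_deps)
    return (t, t), u, dx, du
```

Calling `barrier_sensitivity(1.0, 1e-4)` gives estimates that can be compared with the exact values above. A production code would obtain $x(\epsilon,r)$ from the unconstrained minimization itself and reuse the factored Hessian already formed there.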

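The next results extend this theory to an analysis of the optimal value function of Problem $P(\epsilon)$ along the Kuhn-Tucker point trajectory $[x(\epsilon), u(\epsilon), w(\epsilon)]$.

The optimal value function is defined as:

$$f^*(\epsilon) = f[x(\epsilon),\epsilon], \qquad (9)$$

and the "optimal value Lagrangian" is defined as:

$$L^*(\epsilon) = L[x(\epsilon),u(\epsilon),w(\epsilon),\epsilon]. \qquad (10)$$

Theorem 3: (First and second order changes in the optimal value function, Armacost and Fiacco (1975).) If Assumptions A1 - A4 hold for Problem $P(\epsilon)$, then for $\epsilon$ near $0$, $f^*(\epsilon)$ is a twice continuously differentiable function of $\epsilon$, and

(a) $f^*(\epsilon) = L^*(\epsilon)$;

(b) $\nabla_\epsilon f^*(\epsilon) = \nabla_\epsilon L(x,u,w,\epsilon)\,\big|_{(x,u,w) = [x(\epsilon),u(\epsilon),w(\epsilon)]} = \Big[\nabla_\epsilon f(x,\epsilon) - \sum_{i=1}^{m} u_i \nabla_\epsilon g_i(x,\epsilon) + \sum_{j=1}^{p} w_j \nabla_\epsilon h_j(x,\epsilon)\Big]_{(x,u,w) = (x(\epsilon),u(\epsilon),w(\epsilon))}$;

(c) $\nabla_\epsilon^2 f^*(\epsilon) = \nabla_\epsilon\big(\nabla_\epsilon L(x(\epsilon),u(\epsilon),w(\epsilon),\epsilon)^T\big)$.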

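The logarithmic-quadratic loss penalty function (2) can also be used to provide estimates of the first and second order sensitivity of the optimal value function. Let the optimal value penalty function be defined as $W^*(\epsilon,r) = W(x(\epsilon,r),\epsilon,r)$.

Theorem 4: (First and second order sensitivity of $W^*(\epsilon,r)$ and estimates for $f^*(\epsilon)$, Armacost and Fiacco (1975).) If Assumptions A1 - A4 hold for Problem $P(\epsilon)$, then for $(\epsilon,r)$ near $(0,0)$ and $r > 0$, $W^*(\epsilon,r)$ is a twice continuously differentiable function of $\epsilon$ and

(a) $\lim_{r \to 0^+} W^*(\epsilon,r) = L^*(\epsilon) = f^*(\epsilon)$;

(b) $\nabla_\epsilon W^*(\epsilon,r) = \nabla_x W\, \nabla_\epsilon x + \nabla_\epsilon W = \nabla_\epsilon L(x,u,w,\epsilon)\,\big|_{(x,u,w) = (x(\epsilon,r),u(\epsilon,r),w(\epsilon,r))}$;

(c) $\lim_{r \to 0^+} \nabla_\epsilon W^*(\epsilon,r) = \nabla_\epsilon L(x(\epsilon),u(\epsilon),w(\epsilon),\epsilon) = \nabla_\epsilon f^*(\epsilon)$; $\qquad (11)$

(d) $\nabla_\epsilon^2 W^*(\epsilon,r) = \nabla_\epsilon\big(\nabla_\epsilon L(x(\epsilon,r),u(\epsilon,r),w(\epsilon,r),\epsilon)^T\big)$;

(e) $\lim_{r \to 0^+} \nabla_\epsilon^2 W^*(\epsilon,r) = \nabla_\epsilon^2 f^*(\epsilon)$.

This result provides a justification for estimating $f^*(\epsilon)$, $\nabla_\epsilon f^*(\epsilon)$ and $\nabla_\epsilon^2 f^*(\epsilon)$ by $W^*(\epsilon,r)$, $\nabla_\epsilon W^*(\epsilon,r)$ and $\nabla_\epsilon^2 W^*(\epsilon,r)$, respectively, when $r$ is positive and small enough.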

Since Corollary 2.1 and continuity imply that $\lim_{r \to 0^+} f(x(\epsilon,r),\epsilon) = f^*(\epsilon)$, another estimate of the optimal value function (9) is provided by $f^{\#}(\epsilon,r) = f(x(\epsilon,r),\epsilon)$ when $r > 0$ and small. Direct application of the chain rule for differentiation then yields, for $x = x(\epsilon,r)$,

$$\nabla_\epsilon f^{\#}(\epsilon,r) = \nabla_x f(x,\epsilon)\, \nabla_\epsilon x(\epsilon,r) + \nabla_\epsilon f(x,\epsilon). \qquad (12)$$

Under the given assumptions, continuity also assures that $\nabla_\epsilon f^{\#}(\epsilon,r) \to \nabla_\epsilon f^*(\epsilon)$ as $r \to 0^+$. Thus both $\nabla_\epsilon f^{\#}(\epsilon,r)$ and $\nabla_\epsilon W^*(\epsilon,r)$ are estimates of $\nabla_\epsilon f^*(\epsilon)$ for $r$ sufficiently small.

It should be noted that these estimates are functionally related since

$$\nabla_\epsilon W^*(\epsilon,r) = \nabla_\epsilon f^{\#}(\epsilon,r) - \sum_{i=1}^{m} u_i\,(\nabla_x g_i\, \nabla_\epsilon x(\epsilon,r) + \nabla_\epsilon g_i) + \sum_{j=1}^{p} w_j\,(\nabla_x h_j\, \nabla_\epsilon x(\epsilon,r) + \nabla_\epsilon h_j)\,\Big|_{x = x(\epsilon,r)}.$$

From this expression, it is clear that $\nabla_\epsilon f^{\#}(\epsilon,r)$ is the better estimate of $\nabla_\epsilon f^*(\epsilon)$, the remaining terms in $\nabla_\epsilon W^*(\epsilon,r)$ simply constituting "noise" that is eliminated as $r \to 0$. However, by using the expression for $\nabla_\epsilon W^*(\epsilon,r)$ given by (11), $\nabla_\epsilon W^*(\epsilon,r)$ can be evaluated without necessitating the calculation of $\nabla_\epsilon x(\epsilon,r)$, which is required to compute (12). Thus, the cruder but computationally much cheaper estimate of $\nabla_\epsilon f^*(\epsilon)$ given by Equation (11) has now been introduced as an option in the computer program as a preliminary screening device to identify crucial parameters. Restriction of subsequent calculations to these parameters, and other calculations such as the sharper estimate of $\nabla_\epsilon f^*(\epsilon)$ given by (12), are provided as additional options.
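
The trade-off between the two estimates can be seen on a one-variable toy problem. The sketch below is an assumption made for illustration (minimize $x^2$ subject to $g(x,\epsilon) = x - \epsilon \ge 0$ with $\epsilon > 0$, so that $f^*(\epsilon) = \epsilon^2$ and $\nabla_\epsilon f^*(\epsilon) = 2\epsilon$); the function name `value_gradient_estimates` is hypothetical. It evaluates the cheap estimate (11), which needs only the multiplier estimate, and the sharper estimate (12), which also needs $\nabla_\epsilon x(\epsilon,r)$ from (4).

```python
import math


def value_gradient_estimates(eps: float, r: float):
    """Return (cheap, sharp) estimates of d f*/d eps for an assumed toy problem:
    minimize x^2 subject to g(x, eps) = x - eps >= 0, with eps > 0.
    """
    # Minimizer of W = x^2 - r*ln(x - eps): positive root of 2x^2 - 2 eps x - r = 0.
    x = (eps + math.sqrt(eps * eps + 2.0 * r)) / 2.0
    g = x - eps
    u = r / g  # multiplier estimate, equation (5)

    # Cheap estimate (11): grad_eps L = df/d eps - u * dg/d eps,
    # with df/d eps = 0 and dg/d eps = -1 for this toy problem.
    cheap = 0.0 - u * (-1.0)

    # Sharper estimate (12): requires dx/d eps from equation (4).
    a = r / (g * g)          # curvature contributed by the barrier term
    dx = a / (2.0 + a)       # -(hessian)^{-1} * mixed = -(2 + a)^{-1} * (-a)
    sharp = 2.0 * x * dx + 0.0
    return cheap, sharp
```

For small positive `r`, both values returned by `value_gradient_estimates(1.0, r)` are close to the exact derivative $2$, with the estimate based on (12) the closer of the two in this example, at the price of the extra linear solve.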

In summary, the basis for the estimation procedure utilized here for a specific problem, say Problem $P(\epsilon)$, is the minimization of the penalty function $W(x,\epsilon,r)$ given by (2). This yields a point $x(\epsilon,r)$ which may be viewed as an estimate of a (local) solution $x(\epsilon)$ of Problem $P(\epsilon)$. The estimate $f(x(\epsilon,r),\epsilon)$ of $f^*(\epsilon)$ is immediately available when $x(\epsilon,r)$ has been determined. The associated optimal Lagrange multipliers $u(\epsilon)$ and $w(\epsilon)$ are estimated by using the relationships given in (5) and (6), respectively.

If desired, all of the first partial derivatives of $y(\epsilon,r) = (x(\epsilon,r),u(\epsilon,r),w(\epsilon,r))^T$ with respect to $\epsilon$, an estimate of $\nabla_\epsilon y(\epsilon)$, may be obtained by first solving (4) and then applying (7) and (8). If the full matrix $\nabla_\epsilon x(\epsilon,r)$ is calculated, then $\nabla_\epsilon f^*(\epsilon)$ is estimated by (12). However, if it is desired to eliminate calculations involving parameters having less effect on local changes of $f^*(\epsilon)$, the screening device described above is used. This entails initial estimation of $\nabla_\epsilon f^*(\epsilon)$ by (11). Components of $\epsilon$ corresponding to the respective components of $\nabla_\epsilon f^*(\epsilon)$ that are deemed inconsequential may then be deleted and would not enter into any subsequent calculations. In particular, it is emphasized that $(x^*,u^*,w^*)$, $f^*(\epsilon)$ and $\nabla_\epsilon f^*(\epsilon)$ may be estimated by $y(\epsilon,r)$, $f(x(\epsilon,r),\epsilon)$ and (11b), respectively, without calculating any components of $\nabla_\epsilon y(\epsilon,r)$, once $x(\epsilon,r)$ is known.

3. Example: A Large-scale Multi-item Inventory Model

Traditionally, inventory models have been formulated to minimize some function of the ordering, holding and shortage (or backorder) costs subject to various constraints. Schrady and Choe (1971) have formulated an inventory model which appears to have much greater relevance for an inventory system in a noncommercial environment, such as institutional or military. The costs used in the traditional models may be quite artificial and the real objective of the system is often maximization of a measure of readiness or service, here assumed to be equivalent to minimization of stockouts. In addition, the stock points of such supply systems are inevitably constrained by investment and reorder workload limitations.

Schrady and Choe's multi-item inventory system assumes these constraints along with the specific objective of minimizing the total time-weighted shortages. The decision variables are taken to be the "reorder quantities" and the "reorder points," respectively, how much to order and when to order each item in the inventory. A three-item example problem was solved by Schrady and Choe (1971) using the SUMT computer code (Mylander, et al., 1971). Subsequently, McCormick (1972) showed how the special structure of this inventory model can be used to facilitate the use of the SUMT code to solve very large inventory problems. He also extended the model to include constraints on storage volume and the probability of depletion of critical items.

The model and example presented here are the original ones due to Schrady and Choe. The penalty function technique described in the preceding section was used to solve the example and calculate the partial derivatives of various quantities of interest, with respect to each parameter involved in defining the model. (The analysis can be applied to the extended model without difficulty.)

Detailed development of the model is beyond the scope of this paper. The interested reader is referred to the Schrady-Choe and McCormick papers. Here, we give a summary treatment of the various conditions and relationships upon which the model is based. We then tabulate the results obtained in solving the resulting nonlinear programming problem and applying the sensitivity analysis methodology. A number of observations and interpretations are offered to illustrate the many uses to which the sensitivity information might be applied.

It is assumed that the amount of each item in inventory is always known, that all demand which occurs when the on-hand stock is zero is backordered, and that the demand which occurs during the time between the placement of an order and its receipt by the stock point (i.e., the "lead time demand") is normally distributed with known mean $\mu_i$ and variance $\sigma_i^2$.

For the ith item, let

$c_i$ = item unit cost (in dollars),

$\lambda_i$ = mean demand per unit time (in units),

$r_i$ = reorder point,

$Q_i$ = reorder quantity,

$\phi(x)$ = the Normal (0,1) density function,

$\Phi(z) = \int_z^\infty \phi(x)\,dx$ = the Normal (0,1) complementary cumulative distribution function.

In addition, let $K_1$ be the investment limit in dollars, $K_2$ the number of orders per unit of time that constitutes the reorder workload limit, and $N$ the total number of items in the inventory.

It can be shown that the expected time-weighted shortage of item i at any point in time is given by

$$B_i(Q_i,r_i) = \frac{1}{Q_i}\,[\beta_i(r_i) - \beta_i(Q_i + r_i)],$$

where

$$\beta_i(r_i) = \frac{1}{2}\,[\sigma_i^2 + (r_i - \mu_i)^2]\; \Phi\!\left(\frac{r_i - \mu_i}{\sigma_i}\right) - \frac{\sigma_i}{2}\,(r_i - \mu_i)\; \phi\!\left(\frac{r_i - \mu_i}{\sigma_i}\right).$$

The expected on-hand inventory of item i is given by $r_i + Q_i/2 - \mu_i + B_i(Q_i,r_i)$ and the expected number of orders placed per unit time for item i is $\lambda_i/Q_i$.

Using the above expressions and assumptions, Schrady and Choe (1971) indicate that meaningful approximations of the given quantities are obtained even when the second term is dropped from the expression for the expected shortages, and when the last term is dropped from the expression for expected on-hand inventory. The given assumptions and simplifications then lead readily to the following nonlinear programming problem (which Schrady and Choe (1971) proved convex),

$$\min_{Q,r}\; Z(Q,r) = \sum_{i=1}^{N} \beta_i(r_i)/Q_i$$

subject to

$$g_1(Q,r) = K_1 - \sum_{i=1}^{N} c_i\,(r_i + Q_i/2 - \mu_i) \ge 0, \qquad (SC)$$

$$g_2(Q,r) = K_2 - \sum_{i=1}^{N} \lambda_i/Q_i \ge 0,$$

with $r_i$ unrestricted in sign, $Q_i \ge 0$, $i = 1,\ldots,N$, $Q = (Q_1,\ldots,Q_N)^T$, $r = (r_1,\ldots,r_N)^T$, and $g_1$ and $g_2$ representing the investment and workload constraints, respectively.
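
For readers who wish to experiment with Problem (SC), the following minimal sketch evaluates the objective and the two constraints for the three-item data of Table 1. The function names are hypothetical, and the form of $\beta_i$ is the one written out above; it is offered as an illustration and is not the SUMT implementation used in the paper.

```python
import math

# Table 1 data for the three-item example.
MU = [100.0, 200.0, 300.0]       # mean of lead-time demand
SIGMA = [100.0, 100.0, 200.0]    # standard deviation of lead-time demand
COST = [1.0, 10.0, 20.0]         # item unit cost (dollars)
LAM = [1000.0, 1500.0, 2000.0]   # mean demand per unit time
K1, K2 = 8000.0, 15.0            # investment limit, re-order workload limit


def normal_pdf(z: float) -> float:
    """Normal (0,1) density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_ccdf(z: float) -> float:
    """Normal (0,1) complementary cumulative distribution function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def beta(r: float, mu: float, sigma: float) -> float:
    """beta_i(r_i) as used in the expected shortage expression."""
    z = (r - mu) / sigma
    return (0.5 * (sigma ** 2 + (r - mu) ** 2) * normal_ccdf(z)
            - 0.5 * sigma * (r - mu) * normal_pdf(z))


def objective(Q, r) -> float:
    """Z(Q, r): sum over items of beta_i(r_i) / Q_i."""
    return sum(beta(ri, m, s) / qi for qi, ri, m, s in zip(Q, r, MU, SIGMA))


def constraints(Q, r):
    """Return (g1, g2): investment and workload constraint values."""
    g1 = K1 - sum(c * (ri + qi / 2.0 - m)
                  for c, qi, ri, m in zip(COST, Q, r, MU))
    g2 = K2 - sum(lam / qi for lam, qi in zip(LAM, Q))
    return g1, g2
```

Evaluating `objective` and `constraints` at the rounded solution reported in Table 2, $Q^* = (533, 246, 285)$ and $r^* = (253, 277, 437)$, should give an objective value close to the tabulated $Z^*$ and two nearly binding constraints; small discrepancies are to be expected because the tabulated solution is rounded to integers.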

The problem data for the Schrady-Choe three-item example and the initial starting point for the SUMT program are shown in Table 1. As indicated in the table, the lead-time demands and standard deviations, the item unit costs and mean demands, and the investment and workload limits are all treated as parameters in conducting the sensitivity analysis.

Table 2 gives the computer solution and Table 3 the final estimate of the first partial derivatives of the optimal value function $Z^*$ with respect to the problem parameters. Relative to the criteria used in the computer program, the Table 3 results indicate that the optimal value function is sensitive to parameters $K_2$, $c_1$, $\sigma_2$, $c_2$, $\sigma_3$ and $c_3$. Many inferences are possible. For example, the fact that the solution is particularly sensitive to the values of the standard deviations of the lead time demand of items 2 and 3 might indicate that, since these parameters were obtained by sampling, additional sampling of these lead time demands may very well be warranted to reduce the associated standard deviations.

Table 3 also suggests that the optimal solution value is very sensitive to all of the item costs. If the structure of Problem (SC) is examined, this result may at first appear contradictory since the $c_i$ appear only in the investment constraint and the optimal value function, according to Table 3, is apparently not very sensitive to the investment limit $K_1$. The problem is one of precise interpretation. The partial derivatives measure rate of change. But inspection of the investment constraint $g_1(Q,r)$ at $(Q^*,r^*)$ reveals that the change in item cost $c_i$ by any amount $\Delta c_i$ has the same effect on the constraint as a change in the investment limit $K_1$ of $-(r_i^* + Q_i^*/2 - \mu_i)\,\Delta c_i$. Since the quantity in parentheses may be verified from Tables 1 and 2 to be much greater than 1 for all i, it follows that the effect of changing any $c_i$ by any increment $\delta$ will be much greater on the constraint (and hence, on the optimal value $Z^*$ since the constraint is binding) than the effect of changing $K_1$ by the same amount $\delta$. This implies that $|\partial Z^*/\partial c_i| > |\partial Z^*/\partial K_1|$ for each i and, in fact, it can be shown here that $\partial Z^*/\partial c_i = -(r_i^* + Q_i^*/2 - \mu_i)\,\partial Z^*/\partial K_1$, so that the relationships indicated are indeed precisely verified.
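
The relation $\partial Z^*/\partial c_i = -(r_i^* + Q_i^*/2 - \mu_i)\,\partial Z^*/\partial K_1$ can be checked against the tabulated values. The short sketch below uses the rounded entries of Tables 1, 2 and 3; the function name is hypothetical, and because the inputs are rounded, agreement with the tabulated $\partial Z^*/\partial c_i$ can only be expected to within rounding error.

```python
# Rounded values taken from Tables 1, 2 and 3.
MU = [100.0, 200.0, 300.0]           # mean lead-time demand (Table 1)
Q_STAR = [533.0, 246.0, 285.0]       # optimal reorder quantities (Table 2)
R_STAR = [253.0, 277.0, 437.0]       # optimal reorder points (Table 2)
DZ_DK1 = -0.0052                     # dZ*/dK1 (Table 3)
DZ_DC_TABLE = [2.1713, 1.0345, 1.4452]   # dZ*/dc_i (Table 3)


def cost_sensitivities_from_k1():
    """dZ*/dc_i implied by dZ*/dK1 through the investment constraint."""
    return [-(r + q / 2.0 - mu) * DZ_DK1
            for q, r, mu in zip(Q_STAR, R_STAR, MU)]


def relative_differences():
    """Relative gap between implied and tabulated dZ*/dc_i, item by item."""
    implied = cost_sensitivities_from_k1()
    return [abs(a - b) / b for a, b in zip(implied, DZ_DC_TABLE)]
```

The multipliers $r_i^* + Q_i^*/2 - \mu_i$ are 419.5, 200 and 279.5 for the three items, which is why a unit change in any $c_i$ moves the constraint far more than a unit change in $K_1$.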

The above observations might also suggest that some care must be taken in interpreting the results. Changes in the parameter associated with the largest (in absolute value) partial derivative will give the greatest local change in the optimal value of the objective function, compared to a change of the same magnitude in any other parameter taken individually. This follows because either the objective function and/or some of the constraints (as above) are most significantly affected by this parameter change at the current solution. General rules have not been given for selection of optimal changes in the parameters, i.e., for determining the optimal magnitude and combination of such changes. It is well beyond the scope of this paper to pursue this "macro-analysis" determination, though it should be noted that the greatest local rate of decrease in the optimal value function is along the direction of the negative of the gradient of this function in parameter space (i.e., along the vector composed of the negative of the components of the partial derivatives with respect to the various parameters). A user would nonetheless have to determine the feasibility of this direction of change and, if feasible, the optimal move along this vector, taking into account other factors such as the relative "cost-effectiveness" of any schedule of changes in any model parameter.

Table 1
INVENTORY PROBLEM DATA

                                   ITEM i
QUANTITY                    1        2        3
PARAMETERS
  $\mu_i$                 100      200      300    (MEAN OF LEAD-TIME DEMAND)
  $\sigma_i$              100      100      200    (S.D. OF LEAD-TIME DEMAND)
  $c_i$                     1       10       20    (ITEM UNIT COST - DOLLARS)
  $\lambda_i$           1,000    1,500    2,000    (MEAN DEMAND/UNIT TIME)
  $K_1$                 $8,000                     (INVESTMENT LIMIT)
  $K_2$                 15 re-orders/unit time     (RE-ORDER WORKLOAD LIMIT)
VARIABLES (STARTING POINT)
  $Q_i^0$                 600      270      300    (AMOUNT ORDERED)
  $r_i^0$                 200      260      400    (RE-ORDER LEVEL)


Table 2
SOLUTION AND LAGRANGE MULTIPLIERS

                                   ITEM i
QUANTITY                    1        2        3
VARIABLES
  $Q_i^*$                 533      246      285
  $r_i^*$                 253      277      437
LAGRANGE MULTIPLIERS
  $u_1^*$               .0052
  $u_2^*$               .6230
VALUE
  $Z^*$                12.987


Table 3
OPTIMAL VALUE FUNCTION DERIVATIVES

                                             ITEM i
PARTIALS                              1          2          3
  $\partial Z^*/\partial \mu_i$        -.0000     -.0003     -.0008
  $\partial Z^*/\partial \sigma_i$      .0119      .0897^a    .1729^a
  $\partial Z^*/\partial c_i$          2.1713^a   1.0345^a   1.4452^a
  $\partial Z^*/\partial \lambda_i$     .0012      .0025      .0022
  $\partial Z^*/\partial K_1$          -.0052
  $\partial Z^*/\partial K_2$          -.6230^a

^a Deemed "significant" by criterion, $|\Delta_\epsilon Z^*|/Z^* > .001$ for a unit change in the given parameter, where $\Delta_\epsilon Z^*$ is the estimated first order change in $Z^*$. This criterion was selected arbitrarily for illustrative purposes. Criteria appropriate to the particular application can be selected by a user.

Referring back to Table 2, we note that the Lagrange multiplier $u_2^*$ is much greater than $u_1^*$. Recalling the "sensitivity" interpretation of Lagrange multipliers, which holds under the present conditions, it follows that $u_1^* = -\partial Z^*/\partial K_1$ and $u_2^* = -\partial Z^*/\partial K_2$. This conclusion is consistent with the result obtained in Table 3, and it means that the workload constraint $g_2$ is by far the more effective in determining the minimum number of expected time-weighted shortages at the current value of the parameters, e.g., a small increase in $K_2$ will have a greater effect on reducing $Z^*$ than a small increase in $K_1$. Nonetheless, a user must again simultaneously consider the comparative costs involved in making finite changes, in conjunction with their expected effects, to arrive at an optimal marginal improvement based on this first order information. The sensitivity information is valuable, but requires some care in exploiting.

Table 4 gives the estimates of the first derivatives of the optimal reorder quantities $Q_i$ and reorder points $r_i$ with respect to each of the problem parameters. This is extremely detailed information which gives an indication of how the components of the solution vector itself will change as the various parameters change. In particular, this information can be used to obtain a first order estimate of the solution vector of a problem involving different parameter values, having obtained a solution for a given set of parameters.

The partial derivatives of the Lagrange multipliers with respect to the parameters are given in Table 5. Again, these can be used to obtain first order estimates of the Lagrange multipliers of a problem with different parameter values. In particular, the relative effects of the constraints on the optimal value of problems involving different parameter values can be estimated. Furthermore, it can be shown that the partial derivatives of the multipliers with respect to $K_1$ and $K_2$ yield the second partial derivatives of the optimal value function with respect to the parameters $K_1$ and $K_2$, under the present conditions. Thus, the kind of information given in Table 5 can be used to provide a second order estimate of the optimal value function $Z^*$ for different values of these parameters.
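
A second order estimate of this kind can be sketched for a change in the workload limit $K_2$ alone. The example below assumes the sign convention $u_2^* = -\partial Z^*/\partial K_2$ stated in the discussion of Table 2, so that $\partial^2 Z^*/\partial K_2^2 = -\partial u_2^*/\partial K_2$; the function name is hypothetical and the numbers are the rounded entries of Tables 2, 3 and 5.

```python
Z_STAR = 12.987          # optimal value (Table 2)
DZ_DK2 = -0.6230         # first derivative dZ*/dK2 (Table 3), equal to -u2*
DU2_DK2 = -0.1382        # du2*/dK2 (Table 5)
D2Z_DK2 = -DU2_DK2       # assumed: second derivative of Z* is -(du2*/dK2)


def second_order_value_estimate(delta_k2: float) -> float:
    """Second order Taylor estimate of Z* after changing K2 by delta_k2."""
    return Z_STAR + DZ_DK2 * delta_k2 + 0.5 * D2Z_DK2 * delta_k2 ** 2
```

For a unit increase in $K_2$ the first order term lowers the estimate by .6230 and the second order term raises it again by .0691. The estimate is local, and no claim is made here about its accuracy for larger changes.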

To illustrate and test the application of the type of information provided here, the first partial derivatives with respect to $c_1$ of the optimal value function $Z^*$, the solution components $Q_i$ and $r_i$, and the Lagrange multipliers $u_1$ and $u_2$, were used to give a first order (Taylor's Series) estimate of the corresponding solution values associated with the problem where the given value of $c_1$ was increased by one dollar. These estimates were compared with the respective values of the solution obtained by actually solving the perturbed problem. The results are summarized in Table 6. Though the perturbation is large (the parameter being increased by 100% of its current value), the estimates are seen to be extremely accurate with the exception of the estimated reorder quantity $Q_1$. Many uses could be made of the estimated solution; e.g., should it be desirable to solve the perturbed problem accurately, it would be computationally extremely advantageous to use the estimated solution as a starting point.

The complete exploitation of this sensitivity analysis information now available will depend largely on user interest and ingenuity.
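
The first order estimates of Table 6 can be reproduced, to rounding, from the base values in Table 2 and the derivatives with respect to $c_1$ in Tables 3, 4 and 5. The sketch below does this for the quantities whose tabulated entries are fully legible; the dictionary keys and function names are hypothetical.

```python
# Base solution values (Table 2) and partial derivatives with respect to c1
# (Tables 3, 4 and 5), all as tabulated, i.e. rounded.
BASE = {"Z": 12.987, "Q1": 533.0, "r1": 253.0,
        "Q2": 246.0, "r2": 277.0, "u2": 0.6230}
D_DC1 = {"Z": 2.1713, "Q1": -208.8688, "r1": -31.7918,
         "Q2": 15.3140, "r2": -10.3425, "u2": 0.1635}
# Values obtained in the paper by re-solving the perturbed problem (Table 6).
ACTUAL = {"Z": 14.996, "Q1": 412.0, "r1": 229.0,
          "Q2": 257.0, "r2": 268.0, "u2": 0.7671}


def first_order_estimates(delta_c1: float = 1.0) -> dict:
    """F(eps0) + delta_c1 * dF/dc1 for each tabulated quantity F."""
    return {name: BASE[name] + delta_c1 * D_DC1[name] for name in BASE}


def percent_abs_error(estimates: dict) -> dict:
    """Absolute error of each estimate as a percentage of the actual value."""
    return {name: 100.0 * abs(estimates[name] - ACTUAL[name]) / ACTUAL[name]
            for name in estimates}
```

Because the inputs are rounded, the recomputed estimates and errors can differ from the printed Table 6 entries in the last digit. The reorder quantity $Q_1$ stands out with an error above 20 percent, as noted in the text.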


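Table 4
SOLUTION POINT SENSITIVITY

                                                 ITEM i
PARTIALS                                 1            2            3
  $\partial Q_i/\partial K_2$         - 47.3187    - 18.7610    - 14.906
  $\partial r_i/\partial K_2$            5.2265       6.1961       9.967
  $\partial Q_i/\partial c_1$         -208.8688      15.3140      14.375
  $\partial r_i/\partial c_1$         - 31.7918    - 10.3425    - 20.002
  $\partial Q_i/\partial \sigma_2$        .8469        .2271        .108
  $\partial r_i/\partial \sigma_2$        .1273       1.0337      -  .491
  $\partial Q_i/\partial c_2$            8.1908     -  4.9719       3.852
  $\partial r_i/\partial c_2$         -  2.6783     -  7.2676    -  7.108
  $\partial Q_i/\partial \sigma_3$    -  1.2374        .4033        .584
  $\partial r_i/\partial \sigma_3$        .2611        .3839        .044
  $\partial Q_i/\partial c_3$             .1670        .4442        .425
  $\partial r_i/\partial c_3$         -  2.3072     -  3.1702    - 12.152
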
Table 5
L.M. SENSITIVITY

OPTIMAL L.M. DERIVATIVES
                                            CONSTRAINT i
PARTIALS                             1: INVESTMENT    2: WORKLOAD
  $\partial u_i^*/\partial K_2$          -.0002           -.1382
  $\partial u_i^*/\partial c_1$           .0006            .1635
  $\partial u_i^*/\partial \sigma_2$      .0000            .0006
  $\partial u_i^*/\partial c_2$           .0002            .0489
  $\partial u_i^*/\partial \sigma_3$      .0000            .0020
  $\partial u_i^*/\partial c_3$           .0003            .0323


Table 6
FIRST ORDER ESTIMATES FOR A UNIT INCREASE OF PARAMETER $c_1$

QUANTITY $F(\epsilon^1)$     ESTIMATE^a      ACTUAL      % ABS. ERROR
  $Z^*(\epsilon^1)$             15.159        14.996          1.08
  $Q_1(\epsilon^1)$                324           412         21.36
  $r_1(\epsilon^1)$                221           229          3.49
  $Q_2(\epsilon^1)$                261           257          1.56
  $r_2(\epsilon^1)$                267           268           .37
  $Q_3(\epsilon^1)$        (illegible)           297           .67
  $r_3(\epsilon^1)$                417           420           .71
  $u_1(\epsilon^1)$              .0058         .0057          1.75
  $u_2(\epsilon^1)$              .7865         .7671          1.94

^a $F(\epsilon^0) + (\epsilon^1 - \epsilon^0)^T \nabla_\epsilon F(\epsilon^0) = F(\epsilon^0) + (1)\,\partial F(\epsilon^0)/\partial c_1$


The Generalized Inverse in Nonlinear Programming -
Equivalence of the Kuhn-Tucker, Rosen and
Generalized Simplex Algorithm Necessary Conditions*

L. Duane Pyle
University of Houston

Introduction


The equivalence of the Kuhn-Tucker and the Rosen Gradient Projection conditions for optimality was established by Mangasarian [4], who considered the following nonlinear programming problem formulation:

Maximize $f(x)$

(1.1) where $H(x) = [h_1(x), \ldots, h_m(x)]^T \ge \theta$, $x \in E^n$,

and $h_1(x), \ldots, h_m(x)$ and $f(x)$ are concave, differentiable. ($\theta$ is a vector of zeros.)

Except for minor changes in notation, the following, alternative, nonlinear programming problem formulation is discussed by Luenberger [3]:

Minimize $f(x)$

(1.2) where $H(x) = [h_1(x), \ldots, h_s(x)]^T = \theta$, $G(x) = [g_{s+1}(x), \ldots, g_{s+t}(x)]^T \ge \theta$, $x \in E^n$,

and $h_1(x), \ldots, h_s(x)$, $g_{s+1}(x), \ldots, g_{s+t}(x)$ and $f(x)$ are differentiable.


If $\bar{x}$ is a point satisfying the constraints

(1.3) $H(\bar{x}) = \theta$, $G(\bar{x}) \ge \theta$,

where

(1.4) $g_j(\bar{x}) = 0$ for $j \in J_1$, $g_j(\bar{x}) > 0$ for $j \in J_2$,

and $J_1 \cup J_2 = \{s+1, \ldots, s+t\}$, then the constraints $h_1(x), \ldots, h_s(x)$, and $g_j(x)$ for $j \in J_1$, are said to be active at $\bar{x}$; the constraints $g_j(x)$ for $j \in J_2$ are said to be inactive at $\bar{x}$. A point, $\bar{x}$, satisfying (1.3) is said to be a regular point for the constraints (1.3) if the gradients of the constraints active at $\bar{x}$ are linearly independent.

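The Kuhn-Tucker Conditions [2] are satisfied at the point $\bar{x}$ for the nonlinear programming problem (1.2) if there exist vectors $w^{(1)}$ and $w^{(2)}$ such that

(1.5) $\nabla f(\bar{x}) + [\nabla h_1(\bar{x}), \ldots, \nabla h_s(\bar{x})]\, w^{(1)} + [\nabla g_{s+1}(\bar{x}), \ldots, \nabla g_{s+t}(\bar{x})]\, w^{(2)} = \theta$,

where $(w^{(2)}, G(\bar{x})) = 0$, $w^{(2)} \le \theta$ and

$$\nabla v(x) = \left[\frac{\partial v}{\partial x_1}(x), \ldots, \frac{\partial v}{\partial x_n}(x)\right]^T.$$
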
Luenberger [3] proves the following (necessary condition):

Theorem 1.1: Let $\bar{x}$ be a relative minimum point for the problem (1.2) and suppose $\bar{x}$ is a regular point for the constraints (1.3), then the Kuhn-Tucker conditions (1.5) are satisfied at $\bar{x}$.

Let

(1.6) $\mathcal{A}^T(\bar{x}) = [\nabla h_1(\bar{x}), \ldots, \nabla h_s(\bar{x}), \nabla g_{j_1}(\bar{x}), \ldots, \nabla g_{j_p}(\bar{x})]$

be the n by (s+p) matrix whose columns consist of the gradients to the constraints which are active at $\bar{x}$; that is, $J_1 = \{j_1, j_2, \ldots, j_p\}$. The Rosen Gradient Projection Conditions [10], [11] are satisfied at the point $\bar{x}$ for the nonlinear programming problem (1.2) if

(1.7) $[I - \mathcal{A}^+(\bar{x})\,\mathcal{A}(\bar{x})]\, \nabla f(\bar{x}) = \theta$ and $[(\mathcal{A}^+(\bar{x}))^T\, \nabla f(\bar{x})]_j \ge 0$ for all $j \in J_1$,

where $\mathcal{A}^+$ is the generalized inverse of $\mathcal{A}$ [4].

The expressions analogous to (1.7) formulated by Rosen, Mangasarian, and Luenberger may be obtained by making use of the relation

(1.8) $(\mathcal{A}^+(\bar{x}))^T = [\mathcal{A}(\bar{x})\,\mathcal{A}^T(\bar{x})]^{-1}\,\mathcal{A}(\bar{x})$

as, for example, in

(1.9) $I - \mathcal{A}^+(\bar{x})\,\mathcal{A}(\bar{x}) = I - \mathcal{A}^T(\bar{x})\,(\mathcal{A}^+(\bar{x}))^T = I - \mathcal{A}^T(\bar{x})\,[\mathcal{A}(\bar{x})\,\mathcal{A}^T(\bar{x})]^{-1}\,\mathcal{A}(\bar{x})$.

Relation (1.8) provided the basis for the computational approach originally proposed by Rosen [10], who gave a procedure for "updating" $[\mathcal{A}(\bar{x})\,\mathcal{A}^T(\bar{x})]^{-1}$. Alternatives proposed by Gill and Murray [1] and Pyle [7], [8], [9], employ an orthonormal basis for the null space of $\mathcal{A}$ in obtaining a representation for $I - \mathcal{A}^+\mathcal{A}$. Motivation for the developments given in [9] was provided by the results of numerical experiments, involving randomly generated linear programming problems, interpreted in the context of the geometry of the simplex algorithm. This interpretation has led to the extension and refinement of the results given in [7] and [8] as reported in [9].

In this paper two results are presented. The first, provided for completeness, is an application of Mangasarian's approach, demonstrating the equivalence of the Kuhn-Tucker and the Rosen Gradient Projection Conditions as formulated for the problem (1.2). The second result establishes the equivalence of the Rosen Gradient Projection Conditions (1.7) and certain conditions which result from a natural extension of a generalization of the simplex algorithm [9] to the nonlinear programming problem.


2. Equivalence of the Kuhn-Tucker Conditions and the Rosen Gradient Projection Conditions

Suppose (1.7) holds at some point $\bar{x}$ which is regular for the constraints (1.3). Then

(2.01) $[I - \mathcal{A}^+(\bar{x})\,\mathcal{A}(\bar{x})]\, \nabla f(\bar{x}) = [I - \mathcal{A}^T(\bar{x})\,(\mathcal{A}^+(\bar{x}))^T]\, \nabla f(\bar{x}) = \theta$

implies

(2.02) $\nabla f(\bar{x}) + \mathcal{A}^T(\bar{x})\,[-(\mathcal{A}^+(\bar{x}))^T\, \nabla f(\bar{x})] = \theta$.

Define $w^{(1)}$, $\bar{w}^{(2)}$ as follows:

(2.03) $\begin{bmatrix} w^{(1)} \\ \bar{w}^{(2)} \end{bmatrix} = -(\mathcal{A}^+(\bar{x}))^T\, \nabla f(\bar{x})$,

where $w^{(1)}$ has s elements and $\bar{w}^{(2)}$ has p elements. Then from

(2.04) $[(\mathcal{A}^+(\bar{x}))^T\, \nabla f(\bar{x})]_j \ge 0$ for all $j \in J_1$

it follows that $\bar{w}^{(2)} \le \theta$. Now define

(2.05) $w^{(2)} = \begin{bmatrix} \bar{w}^{(2)} \\ \theta \end{bmatrix}$,

where $w^{(2)}$ has t elements. (Note: $G(x)$ in (1.2) is composed of t functions.) $\bar{w}^{(2)} \le \theta$ implies $w^{(2)} \le \theta$, and from (2.02) and (2.03) and the form of $w^{(2)}$ it follows that

(2.06) $\nabla f(\bar{x}) + \mathcal{A}^T(\bar{x}) \begin{bmatrix} w^{(1)} \\ \bar{w}^{(2)} \end{bmatrix} = \nabla f(\bar{x}) + [\mathcal{A}^T(\bar{x}),\ \mathcal{F}^T(\bar{x})] \begin{bmatrix} w^{(1)} \\ w^{(2)} \end{bmatrix} = \theta$,

where $\mathcal{F}^T(\bar{x})$ is a matrix of the gradients $\nabla g_j(\bar{x})$ for $j \in J_2$, corresponding to inactive constraints at $\bar{x}$. (2.06), together with

(2.07) $(w^{(2)}, G(\bar{x})) = 0$,

where $w^{(2)} \le \theta$, is (1.5).

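Now, suppose (1.5) holds at some point $\bar{x}$ which is regular for the constraints (1.3). For simplicity, we adopt the notation

(2.08) $j_1 = s + 1, \ldots, j_p = s + p$,

where $J_1 = \{j_1, \ldots, j_p\}$. Letting $w^{(1)}$ and $w^{(2)}$ denote the vectors which satisfy (1.5), the relations

(2.09) $w^{(2)} \le \theta$, $G(\bar{x}) \ge \theta$, and $(w^{(2)}, G(\bar{x})) = 0$,

taken together, imply that

(2.10) $w^{(2)}_{s+p+1} = \cdots = w^{(2)}_{s+t} = 0$; thus $w^{(2)} = \begin{bmatrix} \bar{w}^{(2)} \\ \theta \end{bmatrix}$,

where $\bar{w}^{(2)} \le \theta$ and $\bar{w}^{(2)}$ has p elements. From (1.5),

(2.11) $\nabla f(\bar{x}) + [\nabla h_1(\bar{x}) \ldots \nabla h_s(\bar{x})]\, w^{(1)} + [\nabla g_{s+1}(\bar{x}) \ldots \nabla g_{s+p}(\bar{x})]\, \bar{w}^{(2)} = \theta$,

or

(2.12) $\nabla f(\bar{x}) + \mathcal{A}^T(\bar{x}) \begin{bmatrix} w^{(1)} \\ \bar{w}^{(2)} \end{bmatrix} = \theta$.

Since $\bar{x}$ is a regular point for the constraints (1.3), $\mathcal{A}^T(\bar{x})$ has full column rank, thus

(2.13) $(\mathcal{A}^T(\bar{x}))^+ = [\mathcal{A}(\bar{x})\,\mathcal{A}^T(\bar{x})]^{-1}\,\mathcal{A}(\bar{x}) = (\mathcal{A}^+(\bar{x}))^T$;

therefore,

(2.14) $(\mathcal{A}^+(\bar{x}))^T\, \nabla f(\bar{x}) + \begin{bmatrix} w^{(1)} \\ \bar{w}^{(2)} \end{bmatrix} = \theta$,

or, since $\bar{w}^{(2)} \le \theta$,

(2.15) $[(\mathcal{A}^+(\bar{x}))^T\, \nabla f(\bar{x})]_j \ge 0$ for all $j \in J_1$.

Solving (2.14) for $\begin{bmatrix} w^{(1)} \\ \bar{w}^{(2)} \end{bmatrix}$ and substituting in (2.12), obtain

(2.16) $\nabla f(\bar{x}) - \mathcal{A}^T(\bar{x})\,(\mathcal{A}^+(\bar{x}))^T\, \nabla f(\bar{x}) = [I - \mathcal{A}^+(\bar{x})\,\mathcal{A}(\bar{x})]\, \nabla f(\bar{x}) = \theta$.

(2.15) and (2.16) are the Rosen Gradient Projection Conditions (1.7), formulated for the nonlinear programming problem (1.2).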
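
The correspondence between the two sets of conditions can be checked numerically on a small example. The sketch below is an assumption made for illustration: the toy problem (minimize $x_1^2 + x_2^2 + x_3^2$ subject to $h(x) = x_1 + x_2 + x_3 - 3 = 0$ and $g(x) = x_1 - 2 \ge 0$), the candidate point $(2, 0.5, 0.5)$, and all function names are hypothetical. It forms $(\mathcal{A}^+)^T \nabla f$ through relation (1.8) for a matrix with two active-constraint gradients as rows, evaluates the projected gradient as in (1.9), and recovers the Kuhn-Tucker multipliers through (2.03).

```python
def matvec(M, v):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def rosen_quantities(A, grad_f):
    """For a 2 x n matrix A whose rows are the active-constraint gradients
    (full row rank assumed), return
      y        = (A^+)^T grad_f, computed as (A A^T)^{-1} A grad_f, relation (1.8),
      residual = [I - A^+ A] grad_f, computed as in (1.9),
      w        = -y, the multipliers defined in (2.03).
    """
    # Gram matrix A A^T (2 x 2) and its explicit inverse.
    (a, b), (c, d) = [[sum(p * q for p, q in zip(ri, rj)) for rj in A]
                      for ri in A]
    det = a * d - b * c
    gram_inv = [[d / det, -b / det], [-c / det, a / det]]

    y = matvec(gram_inv, matvec(A, grad_f))
    projected = matvec(transpose(A), y)
    residual = [gf - pj for gf, pj in zip(grad_f, projected)]
    w = [-yi for yi in y]
    return y, residual, w


# Assumed toy problem: the candidate point and the gradients evaluated there.
x_bar = [2.0, 0.5, 0.5]
grad_f = [2.0 * xi for xi in x_bar]          # gradient of x1^2 + x2^2 + x3^2
A = [[1.0, 1.0, 1.0],                        # gradient of h (equality)
     [1.0, 0.0, 0.0]]                        # gradient of g (active inequality)
y, residual, w = rosen_quantities(A, grad_f)
```

At this point the residual vanishes and the component of `y` belonging to the active inequality is nonnegative, so (1.7) holds; the vector `w` then satisfies (2.12) with a nonpositive multiplier for the inequality, as the argument above requires. For larger active sets the explicit 2 by 2 inverse would be replaced by a factorization.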


3. Equivalence of the Rosen Gradient
Projection Conditions and the Generalized
Simplex Algorithm Conditions

Consider the following nonlinear
programming problem formulation:

(3.01)  Minimize $f(x)$ where $H(x) = \theta$, $G(x) \ge \theta$,

with

$H(x) = \begin{bmatrix} h_1(x) \\ \vdots \\ h_s(x) \end{bmatrix}$,  $G(x) = \begin{bmatrix} V(x) \\ x \end{bmatrix}$,  $V(x) = \begin{bmatrix} v_{s+1}(x) \\ \vdots \\ v_{s+r}(x) \end{bmatrix}$,  $x \in E^n$,

and $h_1(x), \ldots, h_s(x)$, $v_{s+1}(x), \ldots, v_{s+r}(x)$
and $f(x)$ are differentiable.

(Note that any problem of the form (1.2)
may be reformulated in the form (3.01).)

It is understood that V(x), if not
empty, consists of all the "coefficiented"
inequality constraints; that is, $v_j(x) \neq x_j$
for (j = 1, 2, ..., n).

Suppose x is a regular point for
(3.01) and, for notational simplicity,
that the active constraints, $v_j(x)$, from
the set

$V(x) = \begin{bmatrix} v_{s+1}(x) \\ \vdots \\ v_{s+r}(x) \end{bmatrix}$

consist of the constraints $v_i(x)$ for
(i = s+1, ..., m), and that the active
constraints from the set $Ix = x \ge \theta$ are
the constraints $(u^{(j)}, x) = 0$ for
(j = m+k+1, ..., n), where
(k = 0, 1, ..., (n - m)), and I is the n by
n identity matrix composed of unit vectors
$\{u^{(1)}, u^{(2)}, \ldots, u^{(n)}\}$.

(Note: The special case k = n - m,
j = n+1, ..., n corresponds to the situation
where $x_j > 0$ for j = 1, 2, ..., n.)

Define

(3.02)  $A^T(x) = [\nabla h_1(x) \ldots \nabla h_s(x)\ \nabla v_{s+1}(x) \ldots \nabla v_m(x)]$,

an n by m matrix whose columns consist of
the gradients to the "coefficiented"
constraints which are active at x,

(3.03)  $c(x) = \nabla f(x)$,

(3.04)  $b(x) = [A(x)]\,x$,

and then denote the columns of A(x) as
follows:

(3.05)  $A(x) = [P_1(x) \ldots P_n(x)]$.

Consider the following linear programming
problem approximation of (3.01) at x:

(3.06)  Minimize $(x, c(x))$ where $[A(x)]\,x = b(x)$, $x \ge \theta$.

The developments given in [9] for the
linear programming problem may be applied
to (3.06):


In summary, letting

(3.07)  $B(x) = [P_1(x) \ldots P_m(x)]$,

x a regular point for (3.01) implies B(x) is
nonsingular. Forming

(3.08)  $[B(x)]^{-1} A(x) = [I \;\; P_{m+1}(x) \ldots P_n(x)]$,

obtain an orthonormal basis, $\{\eta^{(m+i)}(x)\}$
for (i = 1, 2, ..., k), for the subspace
in $E^n$ which is spanned by the linearly
independent columns of the following
matrix:

(3.09)  $F(x) = \begin{bmatrix} R(x) \\ I \\ 0 \end{bmatrix}$

where $R(x) = -[P_{m+1}(x) \ldots P_{m+k}(x)]$ is m by
k, I is k by k and 0 is [n-(m+k)] by k.

The Rosen Gradient Projection Conditions
(1.7), formulated for problem (3.01) in
accordance with the notational conventions
assumed above, involve the matrices

(3.10)  $\mathcal{A}(x) = \begin{bmatrix} A(x) \\ X_k \end{bmatrix}$  and  $I - \mathcal{A}^+(x)\mathcal{A}(x)$,

where $X_k^T = [u^{(m+k+1)} \ldots u^{(n)}]$.

Now, $\mathcal{A}(x) F(x) = 0$ by construction
and the k columns of F(x) form a basis for
the null space of $\mathcal{A}(x)$, thus

(3.11)  $I - \mathcal{A}^+(x)\mathcal{A}(x) = \sum_{i=1}^{k} \eta^{(m+i)}(x)\, \eta^{(m+i)T}(x)$.

The developments given in [9] are thus seen
to provide an alternative method for
obtaining the orthogonal projection matrix
$[I - \mathcal{A}^+(x)\mathcal{A}(x)]$.
The equivalence of the Rosen Gradient
Projection Condition, $[(\mathcal{A}^+(x))^T \nabla f(x)]_j \ge 0$,
and the analogous construction obtained
using the generalization of the simplex
algorithm given in [9], follows from
theorem 3.1.


Theorem  3.1; 


Let 


•  (3.12)  =    (a  (x)i3.(x)c(x)  , 


B-l( 


where


and 


j  =  1,  2, 
k  =  1,  2, 


m 
(n 


(3.13)   Cj=    (Ct  (x)a(x)c  (x)  , 


m+l) 
-P  . 

61  . 


where       j  =  m  +  k,    ...    ,  n 

k=l,    2,    ...    ,  (n-m) 
and  the  "j"th  element  in  the  vector 


-P 

j 

1  i 


is  unity. 
Then 
k 

(3.14)    c  .    =    [  ict  (X)  )'^  c  (x) 


for     j  =  1 ,    2 ,    ...    ,  m 

k=:l,    2,    ...    ,    (n-m  +  1) 

and 

k 

(3.15)      Cj  =    [(a^(x))^  c(x)]  ^j^^^ 


for  j  =  m  +  k, 
k  =  1,  2, 


,  n 
,  (n-m) 


Proof: Upon substituting $\mathcal{A}^T(x)(\mathcal{A}^+(x))^T$
for $\mathcal{A}^+(x)\mathcal{A}(x)$ in (3.12) and (3.13)
respectively, obtain


(3.16)    c.=    iO^  [x)  (Ot  (x))'^c  (x)  , 


B  ^  (x)u  ^ 


■"a  -'■(x)u' 


=    ((^(x))  c(x),a(^) 
=    (  i.Otix))^cU)  ,  fu*^'  ]  )  = 

[(.a  (x))^  c(x)] j 

where     j  =  1 ,   2 ,    ...    ,  m 

k=l,    2,    ...    ,    (n-m-l)  , 
which  is    (3.14);  and 


{3.17)Cj  =    (a'^(x)  (0t(x)  )'^c(x) 


=  ( (a^(i) )^c(x) , a (X) 


-P  . 

D 


=  (  iCt  (x))'^  c  (x)  ,  u 
=   [(Ol{x))'^  c(x)]  . 


(j) 


j-k+1 

where     j  =  m  +  k,    ...    ,  n 

k  =  1 ,   2 ,    ...    ,    (n-m)  , 
which  is  (3.15). 
Remarks:


and 


(i)   The  vectors 


■    _  /^ 

-Pj (x) 

-B 

61  . 

: 

1 

62  . 

: 

■1 


both involve $B^{-1}(x)$, and this permits
computation of the $\bar{c}_j^k$ values by a
Revised Simplex Algorithm approach, as
described in [9]. Also, see [5] where the
matrix F(x) is used directly in
implementing the method of reduced
gradients.

(ii) In [9] the vectors $\{\eta^{(m+i)}\}$ for
(i = 1, ..., k) were obtained sequentially
by an application of the Gram-Schmidt
orthogonalization procedure. From (3.12)
and (3.13), it follows that the individual
vectors, $\eta^{(m+i)}$, are not required in
realizing the Rosen Conditions, thus any
other method (e.g. [1]) for generating the
orthogonal projection having a range equal
to the column space of F(x) could be
utilized.

(iii) Note that in (3.12) and (3.16), $u^{(j)}$
denotes an m by 1 unit vector, whereas in
(3.17) $u^{(j)}$ denotes an n by 1 unit vector.
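
Remark (ii) and relation (3.11) can be illustrated with a small Python sketch. It assumes the simplest possible case, a single active constraint with a hypothetical gradient (1, 1, 1), so that B is 1 by 1 and the columns of F are (-1, 1, 0) and (-1, 0, 1). The function names and data are illustrative only and are not taken from [9]. Gram-Schmidt is applied without normalization, and each term is divided by the squared length instead, which is equivalent to using orthonormal vectors but keeps the arithmetic exact.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthogonalize(columns):
    """Gram-Schmidt without normalization, so arithmetic stays exact."""
    basis = []
    for col in columns:
        v = [Fraction(c) for c in col]
        for b in basis:
            coeff = dot(v, b) / dot(b, b)
            v = [vi - coeff * bi for vi, bi in zip(v, b)]
        basis.append(v)
    return basis

def projector_from_basis(columns):
    """Sum of v v^T / (v^T v) over an orthogonal basis of the column space."""
    n = len(columns[0])
    P = [[Fraction(0)] * n for _ in range(n)]
    for v in orthogonalize(columns):
        vv = dot(v, v)
        for i in range(n):
            for j in range(n):
                P[i][j] += v[i] * v[j] / vv
    return P

def projector_from_constraint(a):
    """I - a a^T / (a^T a): the null-space projector for one constraint row a."""
    n, aa = len(a), dot(a, a)
    return [[Fraction(int(i == j)) - Fraction(a[i] * a[j], aa) for j in range(n)]
            for i in range(n)]

# Hypothetical data: one constraint gradient and the matching columns of F.
a = [1, 1, 1]
F_columns = [[-1, 1, 0], [-1, 0, 1]]
assert projector_from_basis(F_columns) == projector_from_constraint(a)
```

Both routes produce the same projection matrix, which is the point of remark (ii): only the projector is needed, not the particular basis vectors used to build it.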
TEACHING MATHEMATICAL PROGRAMMING
TO THE CONSUMER


Marshall L. Fisher
Decision  Sciences  Department 
The  Wharton  School 
University  of  Pennsylvania 


Most courses on mathematical
programming are taught by producers
(i.e., researchers who are developing
mathematical programming technology) to
potential consumers of that technology.
The pedagogical approach that is most
natural for a producer may not be at all
natural for a potential consumer, a fact
that explains, in part, why many
potential consumers fail to become
actual consumers. This paper is
concerned with the particular needs of
the consumer.

I  will  provide  a  number  of  specific 
suggestions  for  course  content  and 
design.  These  suggestions  are  based 
primarily  on  my  experiences  teaching  a 
fundamental  course  on  mathematical 
methods  for  decision  making  offered  by 
the  Decision  Sciences  Department  of  the 
Wharton  School.  This  course,  which 
devotes  10  weeks  to  linear  and  integer 
programming,  is  taken  by  about  200  of 
the  600  students  that  enter  the  M.B.A. 
program  each  year.  The  backgrounds  of 
the  students  vary  widely.  This  year  my 
two  sections  had  a  total  of  70  students 
that  included  12  math  majors,  30 
engineering  and  science  majors,  18 
management  and  economics  majors  and  10 
majors  in  various  non-technical  areas 
like English, Anthropology, Philosophy
and Language. Most of these students do
not currently plan to become
specialists in Mathematical Programming.
Many have not yet decided on a major,
but of those who have, 31 plan to
specialize in Finance.

Those  who  preach  mathematical 
programming  should  also  practice  its 
tenets.  Certainly  the  viewpoint  of 
constrained  optimization  provides  a 
useful  conceptual  framework  for  the 
problem addressed by this paper:
designing a course on mathematical
programming for a specified purpose.
I will use this framework informally
in developing my suggestions. I will
take as an objective that the course
should provide maximum benefit to the
future careers of the students.
Benefits can range from changing the way
the students think about their work to
motivating and enabling them to
formulate and solve mathematical
programming models of problems faced in
their careers. Constraints are imposed
on the activities we can perform to
optimize this objective by the limited
teaching time available, by the students'
diverse and, in some cases, limited
mathematical backgrounds, and by their
long term career plans that frequently
do not place primary importance on
technical expertise in mathematical
programming.

I will now list seven
recommendations for the design of a
course that "solves" the "optimization
problem" I have just outlined. All
suggestions are admittedly subjective,
but I believe they follow logically from
the framework I have given.

1. Methodological topics should be
selected to maximize the ratio of their
short run benefits to their cost.

By  'short  run  benefits'  I  mean  the 
ability  to  operationally  solve  real 
problems  and  by  'cost'  I  mean  the 
intellectual  prerequisites  and  time 
requirements  to  understand  the  topic. 
This criterion leads to the selection of
topics like linear programming,
separable programming, heuristics and
branch and bound methods for integer
programming models, and the special
algorithm for transportation problems.
It requires the exclusion of many
fascinating topics in nonlinear and
integer programming that (a) would
require too much time to teach or (b)
have too great a mathematical
prerequisite or (c) lie on the frontier
of research and thus have not yet
demonstrated their practical usefulness
or (d) have all of the above drawbacks.
I would list as borderline cases
Lagrangian relaxation and an informal
treatment of the analysis of heuristics
and algorithms for combinatorial
optimization problems.


2. Algorithms should be taught in a
way that conveys their logic but avoids
technicalities.

One good method for doing this is
to simply demonstrate the algorithm on a
small example derived from a real
application. The students usually have
an intuition about the example that can
be used to explain the logic of the
algorithm. This exercise can be
followed with a verbal statement of the
algorithm for the general case.

A variation on this approach is to
present the students with an example and
ask them to use their common sense to
develop a solution procedure.
Frequently the students will be able to
discover the essential features of an
algorithm themselves and at a minimum
this experience provides a good
perspective for understanding a more
formal presentation of the algorithm.
For example, when confronted with a
problem of shipping amounts of a product
from several supply points to several
demand points most students will quickly
devise a tableau format for recording
possible solutions and discover a
feasible solution using some variant of
the Northwest corner rule. With a
little coaching they will make improving
adjustments in this solution until an
optimal solution is obtained. It is
then only necessary to show them that
their adjustment procedure can be
systematized and to provide an optimality
test and proof.
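
The Northwest corner rule mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not course material from the paper: the function name and the supply and demand figures are hypothetical, and the problem is assumed to be balanced (total supply equals total demand).

```python
def northwest_corner(supply, demand):
    """Return a feasible shipment table for a balanced transportation problem.

    Start in the top-left (northwest) cell, ship as much as possible, then
    move down when the supply row is exhausted and right otherwise.
    """
    if sum(supply) != sum(demand):
        raise ValueError("total supply must equal total demand")
    supply, demand = list(supply), list(demand)  # work on copies
    plan = [[0] * len(demand) for _ in supply]
    i = j = 0
    while i < len(supply) and j < len(demand):
        qty = min(supply[i], demand[j])
        plan[i][j] = qty
        supply[i] -= qty
        demand[j] -= qty
        if supply[i] == 0:
            i += 1   # this supply point is used up
        else:
            j += 1   # this demand point is satisfied
    return plan

# Hypothetical data: three supply points and three demand points.
for row in northwest_corner([20, 30, 25], [10, 25, 40]):
    print(row)
```

The resulting table is only a starting point; it ignores shipping costs entirely, which is why the improving adjustments described above are still needed to reach an optimal solution.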


3. Use application articles that
report success stories to show how
mathematical programming can be used.

The increased emphasis on the
publication of solid application work
has produced a wealth of articles that
are highly suitable for classroom
discussion. Journals like Interfaces
are ideal sources for articles of this
type. Recent examples of applications
articles in this journal include a
discussion of inventory planning for a
mining company (Reddy[6]) and a
description of how to win an election
with linear programming (Southwick and
Zionts[7]), an article that is sure to be
a hit in any year divisible by 4.
Management Science, Operations Research
and the Harvard Business Review are also
not to be overlooked for articles
suitable for this purpose. The paper by
Klingman, Randolph and Fuller[4] is
highly recommended for a real-life
illustration of branch and bound methods
combined with transportation or network
flow models for a location analysis
problem. Geoffrion[2] gives an
excellent management-oriented discussion
of integer programming methods for
warehouse location.


4. Use cases to develop modeling
skills.


The  ability  to  construct  a  model  of 
a  complicated  process  is  a  skill  best 
acquired  through  practice.  Case  study 
problems  are  an  excellent  way  for 
providing  students  with  this  practice  if 
the  cases  are  appropriately  selected. 
The  case  should  present  a  complicated, 
messy,  and  apparently  baffling  decision 
problem.  At  the  same  time,  the  problem 
should  have  an  underlying  structure  and 
sufficient  data  should  be  included  to 
allow  the  student  to  discover  and  model 
the  structure  and  eventually  arrive  at  a 
satisfying  resolution  of  the  original 
problem. The feeling of having created
order out of apparent chaos that a
student experiences from analyzing such
a case is the single thing most likely
to 'turn him on' to the use of
mathematical programming.

Errors of two types should be
avoided in the selection of cases. Many
cases are too structured, really just
expanded formulation exercises. Others
that have been designed for policy
discussion are too unstructured and
don't provide enough data for the
student to sink his teeth into. Good
sources for cases that strike a balance
between these extremes are books like
Berry and Whybark[1] and Von
Lanzenauer[8]. I have also had
considerable success with cases
developed from my own or another faculty
member's consulting experiences.

Students  in  the  Wharton  course 
described  earlier  analyze  cases  in 
teams.  Each  team  is  selected  to  have 
some  students  with  technical  backgrounds 
and  some  with  non-technical  backgrounds. 
The results of each team's analysis are
presented in a written report and an
oral report to the class. The Wharton
Communications Consultants, a service
group at Wharton, provide expert
coaching and video taping equipment to
assist the teams in their oral
presentations.

5. Provide computer codes so that case
analyses may be pursued to completion.


The computer codes should be easy
to use yet still solve reasonably large
problems. Ideally they should be
available on an interactive computer.
We have had excellent success with the
package of programs described in Land
and Powell[5]. The package has
simple input and output, provides
linear, integer, and parametric
programming capabilities and is provided
to the students interactively on a
DEC-10 computer.

6. Bring in outside speakers to
provide a feel for the 'real world.'

Staff OR groups and consulting
organizations are frequently involved in
interesting applications of mathematical
programming and they are quite happy to
talk about them. Frequently these
organizations are also involved in
recruiting efforts on campus and view
speaking to classes as a desirable
supplement to these activities. They
can provide breadth, background
information, and a feel for the real
world that is unavailable from other
sources. Examples like that of one
company which routinely solves a 50,000
variable 10,000 constraint LP model of
its corporate activities have a dramatic
impact and provide additional evidence
that mathematical programming is not
only useful but used. Ideally, the
particular speakers chosen should be
related to the applications articles and
cases used.

7. Emphasize the intuitions about
complex systems that mathematical
programming can reveal.

Complex systems are usually
difficult to understand and an
unsystematic approach frequently leads
to intuitions that are incorrect.
Mathematical programming models and
algorithms can enhance one's intuition
about a complex system if the model
solution is dissected 'a posteriori' to
determine why a given answer was
obtained (see Geoffrion[3] for a
discussion and examples on this point).

There  are  at  least  two  reasons  why 
a  manager  should  develop  good  intuitions 
about  a  system  he  manages.  Firstly,  if 
he  has  the  intuition  to  understand  why  a 
particular  model  solution  was  obtained, 
then he will have more faith in the
solution.  Secondly,  the  actions 
prescribed  by  the  model  solution  may  be 
only  a  fraction  of  the  required 
decisions  that  relate  to  the  system 
being  analyzed.  Frequently,  managers 
must  also  make  many  related  operating 
decisions  that  were  too  numerous  and 
detailed  to  include  explicitly  in  the 
model.  Moreover,  even  for  decisions 
which  are  included  in  the  model,  the 
results  must  often  be  manually  adjusted 
to  allow  for  intangible  aspects  of  the 
system  that  were  not  included.  If 
managers have used mathematical
programming methods to obtain a better
intuitive understanding of their system,
then they will do a better job of making
these additional decisions.

Students will appreciate this
phenomenon if they are led to discover
the reasons for answers obtained in
cases and applications articles.
Appropriate examples are numerous. For
the transportation problem, the optimal
shadow prices can reveal the initially
surprising fact that shipping more
product will in some cases reduce
shipping costs. The linear programming
model described in Reddy[6] led to the
discovery that customer orders should be
satisfied with relatively low grade
product even though this generated less
revenue per sale. This policy
completely contradicted existing
folklore within the company, but was
found to be correct because of the
scarcity of high grade product.

My experience has been that
students are much more satisfied with
the idea that mathematical programming
methods  can  enhance  their  own  intuition 
and  reasoning  power  than  they  are  with  a 
viewpoint  that  portrays  an  algorithm 
like  the  simplex  method  as  a  giant 
wrench  with  which  they  can  go  around 
tightening  loose  corporate  nuts. 
Abstract

This paper provides an overview of
the methods employed by the Harvard
Business School in teaching linear
programming. Specifically, we deal with
the pedagogical aspects of teaching linear
programming fundamentals to predominantly
nonmathematical  MBA  students.     These  stu- 
dents will  eventually  become  decision 
makers  in  government  and  business,  and 
since  many  of  their  decisions  will  re- 
quire the  use  of  complex  mathematical 
models,   it  is  crucial  that  these  indivi- 
duals thoroughly  understand  the  funda- 
mentals of  the  application  of  quantita- 
tive tools.     The  cornerstones  of  our 
pedagogy  are  presented:     1)     the  use  of  a 
flexible  interactive  linear  programming 
system,   2)     the  active  participation  of 
faculty  and  students  in  class  discussion, 
3)     the  simulation  of  semi-realistic 
situations    (cases)    in  which  linear  pro- 
gramming has  been  employed,   and  4)  a 
carefully  prepared  set  of  course  mater- 
ials.    With  these  elements,  we  believe 
that  learning  is  enhanced  and  that  the 
students  are  well  prepared  to  success- 
fully use  linear  programming  techniques. 
A  detailed  description  of  the  program 
is  provided. 

Although  quantitative  techniques 
have become an integral component of
most  MBA  curricula,  there  have  been  few 
published  papers  which  deal  with  the 
pedagogical  aspects  of  quantitative 
courses.     To  be  sure,   there  have  been 
numerous  articles  on  general  management 
science  education,   for  example,  the 
special  issue  of  Management  Science  on 
Educational  Issues    [2],  Wagner's  talk 
on  the  ABC's  of  OR   [6],   and  several 
papers    [1,    3,   5]   which  have  described 
the  content  of  various  MS/OR  and  related 
curricula.     By  and  large,   however,  these 
do  not  address  the  specific  issues  of 
pedagogy . 

Pedagogy  is  particularly  relevant 
today  for  several  reasons.     First  of  all, 
the  field  of  management  science,  includ- 
ing linear  programming,   is  entering  a 


maturation  phase  in  which  fundamental  con- 
cepts are  being  refined  and  polished.  Ap- 
plications are  assuming  an  increasingly 
important  role.     In  addition,  many  of  tom- 
orrow's decision  makers  who  will  assume 
positions  in  which  they  are  able  to  make 
use  of  analytical  tools  do  not  possess 
mathematical  background  or  inclination. 
Thus,  we  might  lose  important  opportuni- 
ties for  applying  management  science  if  we 
do  not,   or  cannot,   teach  non-quantitative 
decision  makers  how  to  properly  utilize 
these  analytical  tools.     See  Healy  [4]. 
If  management  science  is  to  continue  grow- 
ing, we  must  develop  reliable  methods  for 
communicating  its  ideas  and  tenets. 

The lack of a suitable pedagogical
framework is most noticeable when
discussing the teaching of analytical tools to
students  who  do  not  possess  mathematical 
background.     This  article  attempts  to  fill 
this  void  by  addressing  the  specifics  of 
teaching  linear  programming  fundamentals 
to  non-quantitative  MBA  students.  Herein, 
we  use  the  pedagogical  device  of  inter- 
active exposition;   a  technique  which  is 
repeatedly  employed  by  the  authors  in  the 
classroom.     Although  this  report  does  not 
strictly  follow  the   "case"   format,   it  does 
describe  the  way  in  which  linear  program- 
ming is  taught  at  the  authors'  institution, 
i.e., Harvard Business School (HBS).

The  primary  issue  which  is  addressed 
in  this  article  is:     how  can  linear  pro- 
gramming be  taught  so  that  it  will  be  an 
effective  planning  tool  which  will  be  1) 
understood  and  2)    used?     The  authors  be- 
lieve that  an  interactive  teaching  method 
promotes  substantially  more  usable  and 
longer  lasting  learning  than  the  tradi- 
tional lecture  method,   especially  among 
students  who  will  become  tomorrow's  gene- 
ral managers.     Often,   in  a  lecture- 
oriented  classroom,   the  norm  requires  that 
the  student  be  attentive  and  carefully 
transcribe  the  instructor's  words  from 
blackboard  to  notebook.     In-class  decision 
making  is  rarely  required.     This  is  reser- 
ved for  take-home  problem  sets  and 
(usually  only  after  hours  of  "cramming") 
exams.     Often  the  student  forgets  much  of 
the content of the course shortly after
the  exam  is  over.      In  our  experience, 
this  happens  less  frequently  if  the  stu- 
dent 

-  learns  by  analyzing  actual  or 
near-replicas  of  business  situa- 
tions in  which  linear  programming 
has  played  a  part, 

-  actively  participates  in  class 
discussions  of  the  analysis,  im- 
plementation,  and  related  mana- 
gerial issues,  and 

-  has  easy  access  to  an  interactive, 
on-line  computer  package  used  in 
analyzing  these  situations. 

Since  the  student  is  at  all  times  person- 
ally involved  in  the  learning  process, 
his  tendency  is  to  integrate  the  concepts 
discussed  and  thus  to  retain  them. 

The  remainder  of  this  article  will 
describe  the  linear  programming  segment 
of  the  required  quantitative  methods 
(Managerial  Economics)   course  taught  in 
the  first  year  of  the  MBA  program  at  Har- 
vard.    Section  1  will  give  a  short  over- 
view of  the  school's  objectives,  student 
characteristics,  and  structure  of  the 
first-year  program.     Section  2  will  de- 
tail the  initial  case  used  in  this  seg- 
ment   (the  durable  Sherman  Motors  case) 
and  the  pedagogy  involved  in  teaching  the 
day-long  introduction  to  linear  program- 
ming.    Section  3  will  describe  the  next 
four  sessions  in  which  the  students  are 
gradually  presented  with  more  involved 
and  realistic  situations.     Since  an  im- 
portant aspect  of  the  learning  experience 
is the interactive linear programming
package  that  is  employed,   the  capabili- 
ties of  this  package  will  be  described. 
This  report  is  largely  descriptive  in 
nature  so  that  the  reader  can  fully 
appreciate  the  unusual  constraints  which 
arise  at  HBS  and  how  the  ensuing  diffi- 
culties are  overcome. 

1. Overview: Philosophy and Structure

The  Harvard  Business  School  MBA  pro- 
gram gives  men  and  women  training  for 
line  management  positions.     The  intent  is 
to  produce  generalists,  not  specialists, 
nor  technical  staff  personnel,   so  that 
neither  mathematics  nor  theory  is 
stressed.     For  example,   the  simplex  meth- 
od per  se  is  not  taught.  (Graphically, 
the  vertex-following  concept  is  demon- 
strated,  but  the  details  of  the  algorithm 
or  the  theory  are  not  discussed.)  The 
emphasis  is  instead  on  formulation  of  ap- 
propriate models  and  the  subsequent  anal- 
ysis of  the  economic  and  managerial 
implications  of  the  solution  results. 
The  rationale  for  this  is  clear:  our 
belief  is  that  general  managers  have  no 
need to know how to pivot, whereas, in
order  to  effectively  use  linear  program- 
ming,  the  interpretation  of  results  is 
essential.      (In  fact,   we  suggest  that  the 


reader  try  to  recall  the  last  time  at 
which  he  needed  to  perform  a  pivot  by 
hand! ) 

In  the  first  year  of  the  MBA  program, 
all  students  are   enrolled  in  the  same 
courses.     The  approximately  800  incoming 
students  are  divided  into  nine  sections 
each of which will remain together in the
same  room  for  all  their  classes  through- 
out the  first  year.     By  the  sixth  week, 
each  section  has  started  to  develop  the 
tight-knit  social  fabric  which  so  charac- 
terizes the  Harvard  Business  School  first- 
year  section.     This,   even  more  than  the 
well-known  case  method,  makes  participa- 
tory education  possible.     The  student, 
surrounded  by  peers  whom,    for  the  most 
part,   he  trusts  and  respects,   is  willing 
to  go  out  on  a  limb,  try  new  directions 
and  experiences.     The  class  size,  approx- 
imately 85,   insures  that  all  points  of 
view  will  be  represented  and  that  most 
important  issues  will  emerge. 

Students  come  from  a  wide  range  of 
geographical  areas  with  an  even  broader 
variety  of  backgrounds  and  experiences. 
This  diversity  also  helps  to  keep  class 
discussion  interesting.     The  majority  of 
the  class  enter  with  at  least  two  years 
of  business  experience.     There  is  no  math- 
ematics prerequisite  and,   typically,  pre- 
vious training  in  mathematics  ranges  from 
Ph.D's    (few)    to  none-since-tenth-grade 
(more  frequent) .     One  might  believe  that 
this  situation  might  inevitably  lead  to 
the  poorly-trained  few  being  lost  or  to 
the  highly-trained  becoming  bored,  but 
the  attributes  of  the  participatory 
learning  can  prevent  this.     For  example, 
in  the  first  or  second  week  of  the  year, 
one  of  the  students  with  a  higher  degree 
of  technical  expertise  will  inevitably 
make  a  remark  involving  a  complex  or  the- 
oretical concept  not  understood  by  most. 
He  might  then  be  asked  to  role-play:  one 
of  the  class's   "poets"  will  be  cast  as 
his   "boss"  and  ask  him  to  explain  his  com- 
ment.    Often  he  tries  futilely.  This 
drives  home  the  lesson  that  the  ability 
to  explain  one's  technical  methods  to 
those  with  less  mathematical  training  is 
as  crucial  in  the  business  world  as  are 
those  methods  themselves.     As  the  year 
progresses, the different learning roles
become more clear: the "poets"
concentrate on content - new concepts and
techniques; the "engineers" refine and
abstract and learn to communicate their
knowledge.

The  core  quantitative  methods  course 
consists  of  five  segments  taught  over  a 
six-month  period:     decision  analysis 
(including  the  basics  of  probability) , 
forecasting    (mainly  multiple  linear  re- 
gression) ,   simulation,   competitive  de- 
cision making    (elementary  game-theoretic 
concepts),   and  linear  programming.  The 
next two sections will describe this
latter  segment. 

2. Managerial Economics Day

The  introductory  sessions  on  linear 
programming,   euphemistically  called  Mana- 
gerial Economics  Day,   take  up  an  entire 
day  for  the  first-year  MBA's,  during 
which  interactive  sessions  are  inter- 
spersed with  discussion  sessions.  Our 
objectives are three-fold:

1) to introduce students to the recognition of situations in
   which linear programming might be useful,

2) to indicate the basic structure of a linear programming
   model, and

3) to introduce students to the interpretation of linear
   programming solutions through an interactive computer
   package.

The case used in this introductory day is Sherman Motors*, an
example adapted from Robert Dorfman's "Mathematical or 'Linear'
Programming: A Nonmathematical Approach," American Economic
Review, December 1953. The example is a simple resource
allocation problem involving a motor manufacturing firm with two
truck models, model 101 and 102, to be produced using four kinds
of capacity: metal stamping, engine assembly, 101 assembly and
102 assembly. Given the data in the case, the linear programming
model below can be formulated.

$$
\begin{aligned}
\text{maximize}\quad & 300\,N_1 + 350\,N_2 \\
\text{subject to}\quad
& N_1 + \tfrac{5}{7}\,N_2 \le 2500 && \text{(metal stamping)} \\
& N_1 + 2\,N_2 \le 3333 && \text{(engine assembly)} \\
& N_1 \le 2250 && \text{(101 assembly)} \\
& N_2 \le 1500 && \text{(102 assembly)} \\
& N_1,\ N_2 \ge 0
\end{aligned}
$$

where $N_1$ and $N_2$ are, respectively, the monthly production
of 101's and 102's. This simple model is depicted graphically in
the case and students, with no previous introduction, are
assigned the task of deciding what to do with it.

The example provides a good transition from situations involving
uncertainties but few complex constraints, to those where the
uncertainty is minor, but the constraints are essential. The
simple formulation permits the easy discussion of basic concepts
such as decision variables, objective function, feasible set, and
so on. Many students analyze the graphical formulation by
suggesting some vertex-comparing strategy (in this example,
enumeration is easy enough) and this allows the introduction of
the vertex-following concept.
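
The vertex-comparing strategy that students suggest can be sketched in a few lines of Python. This is only an illustration and is not part of the case or of CLP: the names `vertices` and `best_vertex` are invented here, the coefficients are those of the Sherman Motors model, and exact fractions are used so that the feasibility checks involve no rounding.

```python
from fractions import Fraction as F
from itertools import combinations

# Each constraint is (a1, a2, b), meaning a1*N1 + a2*N2 <= b.
CONSTRAINTS = [
    (F(1), F(5, 7), F(2500)),   # metal stamping
    (F(1), F(2), F(3333)),      # engine assembly
    (F(1), F(0), F(2250)),      # 101 assembly
    (F(0), F(1), F(1500)),      # 102 assembly
    (F(-1), F(0), F(0)),        # N1 >= 0
    (F(0), F(-1), F(0)),        # N2 >= 0
]
PROFIT = (F(300), F(350))       # contribution per 101 and per 102


def vertices(constraints):
    """Yield every feasible intersection of two constraint boundaries."""
    for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue            # parallel boundaries never meet
        # Cramer's rule for the 2x2 system of the two boundary lines.
        n1 = (b * c2 - a2 * d) / det
        n2 = (a1 * d - b * c1) / det
        if all(p * n1 + q * n2 <= r for p, q, r in constraints):
            yield n1, n2


def best_vertex(constraints, profit):
    """Compare the objective at every vertex and keep the largest."""
    return max(vertices(constraints),
               key=lambda v: profit[0] * v[0] + profit[1] * v[1])


n1, n2 = best_vertex(CONSTRAINTS, PROFIT)
value = PROFIT[0] * n1 + PROFIT[1] * n2
print(float(n1), float(n2), float(value))
```

Under these coefficients the best vertex is the intersection of the metal stamping and engine assembly constraints, roughly 2037 Model 101's and 648 Model 102's. Enumeration of this kind is practical only for very small models, which is why the vertex-following idea is introduced next.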


At this point, after a general discussion
of the issues of linear programming and a
more specific discussion of the Sherman
Motors example, the students are
introduced to the interactive computer
package they will be using for their first
set of exercises. A brief description of
the package is needed here.

CLP*, conversational linear programming,
as it is called, operates by asking
a  number  of  directed  questions  and  allow- 
ing the  user  to  sequentially  request  a 
series  of  options.     It  allows  the  user  to 
input  a  linear  programming  model,   to  edit 
and  file  the  specifications,   and  to  re- 
ceive output  in  several  ways.     The  output 
for  the  Sherman  Motors  example  is  given 
in  Exhibit  I.     Output  section  1  gives  the 
optimal  solution;   section  2  the  values  of 
the  slack  and  surplus  variables;  section 
3  the  value  of  the  dual  variables  or 
shadow  prices;   section  4  the  reduced 
costs  for  non-basic  variables,   i.e.,  the 
decrease  in  the  objective  which  would 
occur  if  the  variable  was  introduced 
into  the  basis  at  a  unit  level.  Output 
sections  5  and  6  give  ranges  on  the  ob- 
jective function  and  right  hand  side 
values  for  purposes  of  sensitivity  anal- 
ysis.    Section  5  gives  ranges  on  the 
coefficients  of  the  objective  function 
such  that  the  optimal  solution  remains 
unchanged  as  long  as  the  coefficient 
remains  within  that  range.  Similarly, 
section  6  gives  ranges  on  the  right  hand 
side  values  within  which  the  shadow 
prices  are  constant. 
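
For a model this small, the shadow prices of output section 3 and the right hand side range of output section 6 can be reproduced by hand. The sketch below assumes, from the graphical solution, that metal stamping and engine assembly are the two binding constraints at the optimum. The variable names are illustrative, and the figures are computed here rather than copied from Exhibit I.

```python
from fractions import Fraction as F

# Binding constraints at the optimum (an assumption taken from the graph):
#   N1 + 5/7 N2 = 2500   (metal stamping)
#   N1 +   2 N2 = 3333   (engine assembly)
# The shadow prices impute each produced truck's contribution to the
# capacities it uses:
#   1   * y_metal + 1 * y_engine = 300   (Model 101)
#   5/7 * y_metal + 2 * y_engine = 350   (Model 102)
det = F(1) * F(2) - F(1) * F(5, 7)
y_metal = (F(300) * F(2) - F(1) * F(350)) / det
y_engine = (F(1) * F(350) - F(300) * F(5, 7)) / det

# Range of the engine assembly right hand side b over which the same two
# constraints stay binding. Along that basis,
#   N2 = 7 (b - 2500) / 9   and   N1 = 2500 - 5 (b - 2500) / 9.
upper = F(2500) + F(1500) * F(9, 7)              # N2 hits the 102 assembly limit
lower = F(2500) + (F(2500) - F(2250)) * F(9, 5)  # N1 hits the 101 assembly limit

print("shadow price, metal stamping :", round(float(y_metal), 2))
print("shadow price, engine assembly:", round(float(y_engine), 2))
print("engine RHS range             :", float(lower), "to", round(float(upper), 2))
```

Within the computed range each extra unit of engine assembly capacity is worth the same amount; outside it a different pair of constraints becomes binding and the shadow price changes.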

The  first  set  of  computer  exercises 
is  given  in  Exhibit  II.     Their  purpose  is 
to  introduce  the  students  to  the  conver- 
sational linear  programming  package  -  in- 
put,  edit,   and  optimize,   and,  more  im- 
portantly,  to  present,   in  an  experiential 
way,   the  concept  of  a  shadow  price.  Thus, 
in exercises 3 and 4, the students are
asked  to  change  the  amount  of  the  avail- 
able engine  capacity  from  3333  to  3334, 
then  to  4333,   then  to  4433.     The  first 
change  causes  an  increase  in  the  objec- 
tive function  equal  to  the  shadow  price 
for  that  constraint,   the  second  change 
causes  an  increase  of  1000  times  the 
shadow  price,   and  the  third  change,  which 
takes  the  right  hand  side  value  outside 
the  range  prescribed   (see  Exhibit  I,  out- 
put section  6)   and  therefore  requires  a 
basis  change,   causes  an  increase  of  less 
than 1100 times the shadow price
which the students expect. Most students,
when examining the output, which has not
yet been interpreted for them, will
notice that the number listed under shadow
price for engine assembly capacity is the
same as the increase in the objective
function which they have just observed.

*This package was developed by Stephen
Bradley. For more information, see
"Using the Conversational Linear
Programming System," ICH #9-172-240.

Exhibit I: CLP output for the Sherman Motors example
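
The three engine capacity changes of exercises 3 and 4 can be replayed with a short script. This is a sketch under stated assumptions, not the CLP package: the function name `max_contribution` is invented here, and the model is re-solved from scratch by comparing vertices for each right hand side value.

```python
from fractions import Fraction as F
from itertools import combinations


def max_contribution(engine_capacity):
    """Best objective value of the two-truck model, found by comparing vertices."""
    constraints = [                        # (a1, a2, b): a1*N1 + a2*N2 <= b
        (F(1), F(5, 7), F(2500)),          # metal stamping
        (F(1), F(2), F(engine_capacity)),  # engine assembly
        (F(1), F(0), F(2250)),             # 101 assembly
        (F(0), F(1), F(1500)),             # 102 assembly
        (F(-1), F(0), F(0)),               # N1 >= 0
        (F(0), F(-1), F(0)),               # N2 >= 0
    ]
    best = F(0)                            # the origin is feasible here
    for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue                       # parallel boundaries
        n1 = (b * c2 - a2 * d) / det
        n2 = (a1 * d - b * c1) / det
        if all(p * n1 + q * n2 <= r for p, q, r in constraints):
            best = max(best, 300 * n1 + 350 * n2)
    return best


base = max_contribution(3333)
gains = {cap: max_contribution(cap) - base for cap in (3334, 4333, 4433)}
for cap, gain in gains.items():
    extra = cap - 3333
    print(f"{cap}: gain {float(gain):.2f}, per extra unit {float(gain / extra):.2f}")
```

With these coefficients the first two changes yield the same gain per extra unit, while the change to 4433 yields less per unit because the 102 assembly limit becomes binding before all of the added engine capacity can be used.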

After  the  completion  of  this  first 
computer  exercise,   there  is  a  return  to 
the  classroom  for  a  45-minute  class  ses- 
sion in  which  the  instructor  answers 
questions  and  reinforces  the  shadow  price 
concept.     The  students  then  return  to  the 
terminals  for  a  second  set  of  exercises 
as  shown  in  Exhibit  III.     It  is  more  com- 
plex than  the  first  and  requires  reformu- 
lation rather  than  simply  revision  of 
coefficients.     The  first  question  intro- 
duces a  third  product  into  the  production 
process.     It  is  important  to  note  that 
this new truck is not produced in the
revised optimal solution. The students are
thus introduced, again experientially
rather than directly, to the notions of
reduced cost and sensitivity on the
objective function coefficients. The
second  question  requires  a  complete  refor- 
mulation with  the  addition  of  two  new 
variables  and  a  new  constraint.     It  pro- 
vides the  framework  for  a  concluding 
discussion  of  the  CLP  output  when  the 
students  return  for  their  third  class 
session.     Exhibit  IV  gives  a  typical 
schedule  for  this  introductory  linear  pro- 
gramming day. 
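
The reduced cost idea behind the Model 103 question can be illustrated by pricing out the new truck against the shadow prices of the two-truck solution. The sketch below rests on several assumptions that are not stated in this form in the text: the Model 103 contribution is taken as $225, capacity usage is expressed in Model 101 equivalents, and the two assembly capacities are slack at the optimum so that their shadow prices are zero. All variable names are illustrative.

```python
from fractions import Fraction as F

# Shadow prices of the two binding capacities in the two-truck model,
# obtained from  y_metal + y_engine = 300  and  5/7 y_metal + 2 y_engine = 350.
det = F(2) - F(5, 7)
y_metal = (F(300) * 2 - F(350)) / det
y_engine = (F(350) - F(300) * F(5, 7)) / det
y_101_assembly = F(0)            # slack capacity carries a zero shadow price

# Capacity used by one Model 103, in Model 101 equivalents:
use_metal = F(2500, 3000)        # stamping shop could make 3000 Model 103's
use_engine = F(3333, 2500)       # engine shop could make 2500 Model 103's
use_101_assembly = F(1, 2)       # half the assembly time of a 101

contribution_103 = F(225)        # assumed contribution per Model 103

# Value of the capacity that one Model 103 would take away from the
# current plan, and the resulting loss if one 103 is forced in.
imputed_cost = (use_metal * y_metal
                + use_engine * y_engine
                + use_101_assembly * y_101_assembly)
shortfall = imputed_cost - contribution_103

print("capacity value used by one 103:", round(float(imputed_cost), 2))
print("loss per 103 forced into plan :", round(float(shortfall), 2))
```

Because the imputed capacity value exceeds the assumed contribution, the new model prices out unfavorably. The shortfall is the per-unit cost of insisting on a Model 103, and the imputed cost is the contribution at which the model would become attractive. Both statements hold only for small changes that leave the same constraints binding.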

The students' reactions to this method are positive. Linear
programming is a topic which the MBA's at first find difficult,
and this interactive approach gives them immediate feedback and
response. Rather than being thrown to the computer with no class
discussion until, at earliest, the next day, they have their
problems cleared up immediately. For the most part, they are
willing to give up the better part of a day for this experience,
and the increased level of learning is noticeable.

Exhibit II

SHERMAN MOTOR COMPANY

Note: For each of these questions, start from the basic
      assumptions of the case. (Do not carry over additional
      assumptions from one numbered question to the next.)

1. Find the best product mix for Sherman Motors under the basic
   assumptions of the case.

2. Find the best product mix if it is found that the stamping
   department capacity is 3,500 Model 102's (as in the case) but
   only 2,000 Model 101's.

3. What is the best product mix if engine assembly capacity is
   raised to 3334? What is an extra unit of engine assembly
   capacity worth?

4. What are 1,000 additional units of Model 101 equivalent
   engine assembly capacity worth? What about 1,100 units?

Exhibit III

SECOND SET OF EXERCISES

Note: For each of these questions, start from the basic
      assumptions of the case. (Do not carry over additional
      assumptions from one numbered question to the next.)

1. Sherman Motors is considering introducing a new economy
   truck, to be called Model 103. The total metal stamping
   capacity would be sufficient for 3,000 Model 103's per month,
   while the total engine assembly shop would be enough for 2500
   Model 103's. The 103's could be assembled in the 101 assembly
   department; each 103 would require only half as much time as
   a 101. Each Model 103 would give a contribution of $225.

   a) Formulate the production decision with the three trucks as
      a linear programming problem and then solve the problem
      with MBALP to verify that no Model 103's should be
      produced.

   b) How much would it cost in terms of contribution if, for
      some other reasons, management insisted that at least one
      Model 103 be made?

   c) How high would the contribution on each 103 have to be
      before it became attractive to produce the new model?

2. The engine assembly line can be put on overtime. Suppose that
   efficiencies don't change and that 2000 units of overtime
   capacity are available. If direct labor costs increase by 50%
   on overtime and if the fixed overhead on the line on overtime
   is $40,000, the variable overhead remaining the same, would
   it pay to go on overtime?

Exhibit IV
TYPICAL CLASS SCHEDULE FOR M.E. DAY

 9:15 - 10:30   First Class Session
10:45 - 11:45   First Computer Session
11:45 - 12:30   Second Class Session
12:30 -  2:00   Lunch and Preparation of Exercise Set
 2:00 -  2:45   Second Computer Session
 2:45 -  3:30   Final Class Session

3. Sessions Following M.E. Day

After Managerial Economics Day, a
sequence of four hour-and-twenty-minute
classes, spanning a two-week period, is
conducted on further aspects of linear
programming fundamentals. The primary
purpose of these classes is to reinforce
the concepts which were introduced during
Managerial Economics Day and to show how
linear programming can be implemented in
a variety of decision environments. The
pedagogical format is as follows. We
begin the first of the four classes with
a  set  of  exercises  in  which  the  students 
are  expected  to  successfully  formulate 
four  smaller  linear  programs,   run  them 
on  CLP,   and  interpret  the  solution  re- 
sults.    The  students  are  then  gradually 
introduced  to  more  difficult  formula- 
tions in  the  remaining  three  days.  By 
the  final  class,   they  have  become  famil- 
iar with  some  of  the  basic  issues  involv- 
ing model  formulation  and  selection. 
During  these  four  sessions  a  number  of 
other  issues  arise,   such  as  how  to  deal 
with  uncertainty  and  the  mechanism  for 
pricing  out  the  non-basic  variables.  The 
highlights  of  these  four  class  sessions 
are  described  in  the  remainder  of  this 
section. 

The  first  session  consists  of  simple 
formulation  exercises  --  a  machine  shop 
problem,   a  transportation  network,   an  ad- 
ditional marketing  constraint  on  the  Sher- 
man Motors  problem,   and  a  simple  bond 
portfolio  investment  in  which  the  shadow 
prices  are  not  particularly  meaningful 
because  the  constraints  take  the  form  of 
ratios.     A  topic  of  note  in  the  teaching 
of  these  exercises  is  the  manner  in  which 
the students participate in teaching as
well  as  learning.     Since  interaction  is  a 


crucial  aspect  of  the  case  method,  the 
instructor  may  start  the  discussion  by 
asking  a  non-quantitative  student  to  show 
how  he  or  she  solved  the  first  exercise. 
If  there  is  little  difficulty  with  this 
aspect,   another  non-quantitative  student 
might  be  asked  to  qualitatively  describe 
the formulation. Eventually the
conversation gets around to discussing and
defining the various basic types of
constraints that can occur in a linear
program:

- product (quality mix),

- capacity or resource limits, and

- supply-use constraints (flow).

If  there  had  been  a  severe  problem 
with  the  initial  formulation,  the  follow- 
ing pedagogical  device  has  proven  useful. 
The  confused  student  is  asked  to  concise- 
ly and  verbally  describe  any  single 
constraint  while  the  instructor  tran- 
scribes the  verbal  description  onto  the 
blackboard.     At  this  point,   the  instructor 
can  either  show  how  the  sentence  can  be 
easily  parsed  into  an  equation  or  ask  for 
volunteers  to  do  likewise.     In  this  manner 
the  poets  are  shown  that  almost  anyone 
can  formulate  mathematically  a  linear  pro- 
gram, provided  a  precise  description  of 
each  restriction  is  developed.     We  are 
careful  not  to  overwhelm  the  students 
with  mathematical  sophistication  at  the 
beginning,   yet  ensure  that  their  logical 
reasoning  is  sound  and  rigorous. 

The  next  class  session  depicts  a 
simple  capital  investment  decision  prob- 
lem,  i.e.,   the  Mitchell  Enterprises  case.* 
Here  the  problem  is  to  formulate  a  linear 
programming  model  for  investing  in  five 
projects    (A,   B,   C,   D,   E)   over  a  four  year 
horizon  with  the  objective  of  maximizing 
the  accumulated  value  of  the  portfolio. 
An  opportunity  rate  is  not  provided;  how- 
ever,  a  6%  short-term  bank  account  is 
available  for  any  money  which  is  not  in- 
vested in  a  given  year.     Exhibit  V  depicts 
the  cash  flows  per  dollar  invested  for 
each  of  these  projects  and  years. 

*Intercollegiate Case Clearing House #4-176-160

Exhibit V
CASH FLOW PER DOLLAR INVESTED

Project:         A        B        C        D        E

Year 1975      -1.00     0       -1.00    -1.00     0
     1976      +0.30    -1.00    +1.10     0        0
     1977      +1.00    +0.30     0        0       -1.00
     1978       0       +1.00     0       +1.75    +1.40

With this information, the students are
asked to formulate and evaluate a linear
programming model for this problem. In
addition to the case, they are given in
advance a supplement containing the CLP
solution to this problem. In the class
discussion, the following important topics
are usually covered:

1)  a  review  of  model  formulation, 

2)  a  demonstration  that  the  oppor- 
tunity rates  can  be  derived  from 
the  optimal  shadow  prices,  and 

3)  a  proof  that  opportunity  rates 
are  not  necessarily  constant  for 
each  year. 

In  addition,   a  brief  discussion  of  how  to 
deal  with  uncertainties  via  sensitivity 
analysis  takes  place  if  time  remains. 
This case has been well received because
of its implications for traditional
discounting methods - a firm's hurdle rate
depends on the set of opportunities it
has available. In addition, it is not a
typical production scheduling situation,
which many regard as synonymous with
linear programming models.
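
One way to write the cash-balance formulation implied by Exhibit V is sketched below. The notation is an assumption introduced here for illustration and is not taken from the case: $A,\dots,E$ are the dollars placed in each project, $S_t$ is the amount left in the 6% bank account in year $t$, and $F$ stands for the funds available in 1975, which the text does not state.

```latex
\begin{aligned}
\text{maximize}\quad & 1.00\,B + 1.75\,D + 1.40\,E + 1.06\,S_{1977} \\
\text{subject to}\quad
& A + C + D + S_{1975} = F && (1975) \\
& B + S_{1976} = 0.30\,A + 1.10\,C + 1.06\,S_{1975} && (1976) \\
& E + S_{1977} = 1.00\,A + 0.30\,B + 1.06\,S_{1976} && (1977) \\
& A,\ B,\ C,\ D,\ E,\ S_{1975},\ S_{1976},\ S_{1977} \ge 0 .
\end{aligned}
% If y_t is the shadow price of one additional dollar of cash in year t,
% measured in 1978 dollars (so y_{1978} = 1), the implied one-year
% opportunity rate from year t to year t+1 is
\qquad r_t = \frac{y_t}{y_{t+1}} - 1 .
```

Under this reading the rates $r_t$ need not be equal from year to year, and each is at least 6% because the bank account is always available.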

On  the  third  day  the  Red  Brand  Can- 
ners*  case  is  used.     This  case  involves 
the  production  of  canned  tomatoes,  tomato 
juice,   and  tomato  paste.  Information 
about  fixed  and  variable  costs,  allocated 
overhead  expenses,  marginal  profit,  and 
so  forth,   is  provided.     The  students  are 
again  asked  to  formulate  the  linear  pro- 
gramming model  and  to  defend  their  form- 
ulation in  class.     This  session  is  note- 
worthy because  the  case  forces  students 
to  extract  the  relevant  accounting  fig- 
ures and  to  use  this  information  within 
the  model.     It  also  vividly  brings  out 
the  limiting  assumptions  of  linear  pro- 
gramming,  i.e.,   constant  returns  to  scale, 
continuous  variables,   non-negativity  and 
so  on.     As  far  as  pedagogy  is  concerned, 
the class discussion can be constructed
to compare linear programming with a "back
of the envelope, seat of the pants"
approach. To accomplish this comparison,
there  is  an  appealing,   but  incorrect, 
formulation  which  can  be  proposed  by  the 
instructor  as  the  easiest  way  to  solve 
this  scheduling  problem.     A  "better"  stu- 
dent will  counter  by  describing  why  the 
intuitive  approach  is  incorrect  and 
giving  the  correct  formulation.  Often, 
other students are unprepared to
challenge this alternative formulation and
the class anxiety rises until the logical
fallacy is uncovered. The purpose
of this subterfuge is to demonstrate to
the students that they must be prepared
to  defend  their  formulation  to  their 
superiors  even  though  their  superiors 
may  not  understand  mathematical  models 
or  symbols.     This  important  lesson  is 
difficult  to  accomplish  outside  the  case 
format. 

In the final linear programming
class  session,   the  Okanagan  Lumber  Com- 
pany* case  is  presented.     This  case  in- 
volves the  production  of  plywood  paneling 
from  various  types  of  lumber.     The  inter- 
mediate steps  in  the  plywood  mill  are 
described  in  detail.     The  problem  of 
scheduling  and  allocating  resources  is 
depicted  as  a  linear  program.     Since  the 
model  has  approximately  50  constraints 
and  75  variables,   the  tableau  is  provided 
to  the  students.     They  are  asked  several 
sensitivity  analysis  questions  and  there- 
by develop  an  appreciation  of  the  model. 
By  this  time,   the  students  see  that  they 
can  understand  even  a  rather  complex  form- 
ulation if  given  enough  time  to  sift 
through  the  details.     Because  of  its  com- 
plexity,  there  are  many  ways  to  teach 
this  case.     Perhaps  the  most  important 
lesson  to  be  learned  is  that  the  modeling 
process  can  uncover  new  avenues  rather 
than  just  showing  how  to  travel  old  paths. 
To  demonstrate  this  idea,   the  instructor 
can  utilize  the  shadow  prices  in  conjunc- 
tion with  prevailing  market  prices  for 
lumber to "invent" new product lines. In
essence the students learn how to price
out non-basic variables: whenever a
combination of marginal profits minus
marginal costs is positive, the product is
profitable and worth introducing, which
is equivalent to saying that the product's
reduced cost is positive and that it can
enter the basis at a positive level. This
case  concludes  the  linear  programming  seg- 
ment of  the  Managerial  Economics  course. 

As  a  final  observation,  we  compared 
the  case  approach  with  the  usual  lecture 
method.     One  of  the  authors  taught  linear 
programming  to  M.S.   and  Ph.D.  students 
from  Harvard's  Division  of  Engineering 
and  Applied  Mathematics.     This  class  fol- 
lowed a  more  traditional  lecture  format 
and  the  students  generally  had  more  exten- 
sive mathematical  training   (linear  alge- 
bra,  etc.)    than  MBA  students.     The  same 
case  given  to  MBA's  on  their  final  exam 
was  given  to  these  students  as  a  part  of 
their final examination. It is
interesting to observe that the examination
results of the non-quantitative MBA's
compare favorably with those of the Ph.D.
students. The "poets" were especially
adept  at  putting  the  model  in  context  with 
respect  to  the  total  investment  decision  - 
including  how  to  deal  with  the  underlying 
uncertainties.

4. Conclusions

We have attempted to show in this
article how linear programming
fundamentals can be successfully taught to a
diverse audience of MBA students. A
parallel situation exists within an
executive management program at Harvard,
in which highly experienced managers
return to academia for a 13 week retraining
period. The linear programming aspects
of this program are almost identical to
those described in Sections 2 and 3 of
this paper.

*Intercollegiate Case Clearing House

Although  we  feel  that  our  peda- 
gogical approach  accomplished  the  goals 
which  have  been  set  forth,  several  im- 
portant questions  remain.     First,  does 
the  successful  application  of  sophisti- 
cated mathematical  models  require  an 
appreciation  by  the  decision  maker  of 
the  solution    strategies  in  the  model? 
An  airline  pilot  may  not  comprehend  the 
physical  laws  of  aerodynamics,  but  he 
can  still  fly.     Nonetheless,  a  thorough 
understanding  of  the  airplane  can  be  an 
advantage  to  the  pilot  whenever  something 
goes wrong. Perhaps this is why
automobile race drivers are intimately
familiar with the mechanical aspects of
their machines.

We suspect that the pertinent answer
to this question lies in the resolution
of the tradeoff between intention and
practicality. In other words, since
there is limited time for learning,
priorities must be established. The
chosen mix of applications and theory
depends upon how one resolves these
issues.

As a second question, how do business
schools deal with field study, the
recent phenomenon which is occurring in
many  law  schools?     Here  teams  of  students 
address  real-world  problems  by  going  out 
into  government  or  business.  Because 
of  the  size  of  the  MBA  program  at  the 
Harvard  Business  School,   it  would  be 
difficult  to  implement  such  a  program; 
however,  this  topic  will  need  to  be 
further  explored. 

Thirdly, will the dramatic improvements
in mini-computers, telecommunications,
and self-paced learning techniques
have  a  significant  impact  on  how  linear 
programming  is  taught?     One  of  our 
associates  has  developed  a  cassette-based 
system*  for  decision  analysis,  and  simi- 
lar ideas  could  be  easily  exploited  in 
linear  programming.     Several  business 
schools  are  already  engaged  in  develop- 
ing self-paced  instructional  materials. 


*This  system  was  developed  by  Howard 
Raiffa.     For  more  information,  see 
Analysis  for  Decision-Making,  Encyclopedia 
Britannica,  1974. 

ABSTRACT

In this paper the authors discuss their experiences
in converting a lecture-oriented mathematical
programming course to the Self-Paced Instructional
(PSI) format. This is an elective course
for students from all the departments (Engineering,
Science, and Management). We will discuss the
different strategies, including the development of
conversational  computer  programs,  which  were  em- 
ployed to  implement  the  PSI  concept  in  this  course 
so  that  its  educational  objectives  are  fully  met. 
We  will  also  discuss  our  experiences  with  the  PSI 
system,  its  pros  and  cons,  and  the  students'  re- 
sponse to  self-paced  learning. 

Introduction 

Undergraduate  engineering  curricula  have 
become  much  more  flexible  during  the  past  half- 
dozen  or  so  years.     Students  are  able  to  obtain 
a  measure  of  specialization  in  one  or  two  areas 
within  a  specified  engineering  discipline;  for 
example,  in  the  School  of  Industrial  Engineering 
at  Purdue  the  undergraduate  students  can  special- 
ize in  Operations  Research,  Systems  Engineering, 
Human Factors, Management, and Manufacturing
Processes. Most of this flexibility has been
obtained  by  increasing  the  number  of  elective 
courses  allowed  in  the  curriculum  without  in- 
creasing the  credit  hours  required  for  gradua- 
tion. 

The author has been teaching for the past seven
years a course on Mathematical Programming for
undergraduate  and  graduate  students.     This  is  an 
elective  course  for  all  industrial  engineering 
students.     In  addition  to  students  from  other 
branches  of  engineering,  the  course  attracts  stu- 
dents from  industrial  management,  agricultural 
economics,  mathematics,  computer  science,  and 
statistics.     Because  of  this,  the  composition  of 
the  class  and  the  interests  of  the  students  vary 
widely.     The  course  is  offered  three  times  a  year 
and  attracts  150-200  students. 


The  basic  objective  of  this  course  is  to  in- 
troduce the  students  to  mathematical  modeling  (in 
particular  linear  programming  models)  for  solving 
real-world  problems.     The  major  portion  of  the 
course  deals  with  solution  techniques  for  these 
models.    From  past  experience  in  teaching  this 
course,  it  has  been  observed  that  the  non-engineer- 
ing students  prefer  to  see  more  emphasis  on  linear 
programming  theory  and  computational  methods.  In 
contrast,  the  engineering  and  business  students 
want  more  emphasis  on  case  studies  describing  nu- 
merous applications  of  linear  programming.  They 
even express varied interests in the linear
programming topics to be discussed in class. For
instance, the mathematics students want the inclusion
of advanced topics like game theory and
decomposition methods. The computer science students like
to see the computer implementation of linear
programming algorithms. The civil engineers show
more  interest  in  topics  like  network  analysis, 
transportation  problems,  and  critical  path  methods. 
This  poses  enormous  problems  in  structuring  and 
teaching  this  course.     In  previous  years,  a  middle- 
of-the-road  approach  was  taken,  so  that  it  would 
hopefully  satisfy  most  of  the  students'  interests. 

Use of Personalized Self-Paced Format

Traditionally  all  students  enrolled  in  a  spe- 
cific course  progress  through  the  material  defined 
by  the  syllabus  at  a  given  rate.     Concepts  pre- 
sented in  a  textbook  are  supplemented  with  regu- 
larly scheduled  lectures  by  the  course  instructor, 
homework  problems  are  turned  in  at  the  proper  time, 
lab experiments are performed during scheduled
hours, and hour tests are taken in class-size
groups when the instructor schedules them.
Fortunately this is not a rut in which the innovative
teacher must remain. Through the years we have
seen the growth in use of concepts of
individualized, self-paced instruction and open laboratories
(see References [1], [2], [3], and [4]). The
American  Society  for  Engineering  Education  has  been 
concerned  with  effective  college  teaching  for  a 
number of years and usually devotes the March issue
of  Engineering  Education  each  year  to  articles  con- 
cerned with  innovative  methods  of  teaching  in 
engineering  curricula.     A  recent  survey  by  Moodie 
[5]  reports  various  innovative  teaching  methods 
used  in  industrial  engineering  courses  across  the 
country.     However,  we  often  find  that  the  major  dif- 
ference between  the  self-paced  version  of  a  course 
and  its  lecture  counterpart  is  merely  the  self- 
pacing;  there  is  still  a  strict  list  of  topics  to 
be  covered.     Variation  within  the  listed  course 
content  is  only  offered  to  the  student  through  the 
vehicle  of  writing  a  term  paper  on  a  special  topic 
of  his  interest,  not  included  in  the  syllabus. 

The  author  of  this  paper  felt  the  need  to 
expand  the  amount  of  flexibility  offered  to  a  stu- 
dent enrolled  in  the  mathematical  programming 
course  beyond  that  which  is  described  in  the  pre- 
ceding paragraphs.     Because  of  the  diversified 
interests  found  among  the  students,  it  was  felt 
that  a  new  Personalized  Self-Paced  Instructional 
(PSI) system could best serve the students' needs.
The  new  PSI  system  developed  by  the  primary  author 
[7] combines traditional lectures, self-pacing,
and mastery learning to provide maximum
flexibility. To obtain any degree of course topic
flexibility it is necessary to evaluate each course in
terms  of  its  educational  objectives.     What  is  the 
terminal  behavior  we  desire  of  a  student  when  he 
completes  this  course?    Then,  in  order  to  make 
additional,  optional  material  available  to  students 
enrolled  in  this  course,  it  is  necessary  to  deter- 
mine the  minimum,  necessary  requirements  from  the 
original  course  content.     In  this  way  specific 
topics  are  made  required  content  of  this  course. 
After  this  is  completed,  the  additional  (optional) 
material  is  made  available  to  the  student  in  the 
form  of  elective  modules. 

Thus, in the new system, the subject matter
for this course is divided into some basic units,
and some optional units. Class lectures are given
for the basic units only. The students elect the
optional units they want to learn according to
their interests, and prepare for them on their own
time (off class hours). For this reason,
one-third of the scheduled class hours are cancelled
to provide time for independent study.

BASIC UNITS: The basic units are so chosen that
they provide the students a minimal amount of
knowledge in the fundamentals of linear
programming. These include:

Unit 1 - Formulation of Linear Programs: To
construct linear programming models of real world
problems.

Unit 2 - The Simplex Method: To learn how
systems of linear equations are solved; to
understand the basic mathematical principles underlying
the simplex algorithm; to use the tableau form of
the simplex method to solve small problems; and to
use the computer code for solving large linear
programs.

Unit 3 - Duality Theory and its Applications:
Symmetric and asymmetric duals, economic
interpretation, duality theorems and applications, shadow
prices, and the dual simplex method.

Unit 4 - Sensitivity Analysis: Variation of
cost coefficients, changing RHS constants, changing
constraint matrix, adding new constraints and
parametric programming with respect to cost and RHS
vectors.

OPTIONAL  UNITS:     The  optional  units  are  selected  to 
provide  diversification,  and  meet  the  varied  and 
special  interests  of  the  students.     For  instance, 
during the summers of '74, '75, and '76, when the
PSI  system  was  tried  for  this  course,  the  following 
optional  units  were  offered  to  the  students: 

Unit  5  -  Project  Networks  and  PERT/CPM 

Unit  6  -  Transportation  and  Assignment  Prob- 
lems 

Unit  7  -  Variants  of  the  Simplex  Method 

Unit  8  -  Nonlinear  Optimization 

Unit  9  -  Bounded  Variable  Linear  Program 

Unit  10  -  Bimatrix  Games 

Unit  11  -  Integer  Programming 

In  addition,  an  individual  project  is  always 
made  available  as  an  optional  unit.     For  each  of 
the  units  (basic  and  optional)  the  students  are 
provided  with  study  guides.     Russell  [8]  calls 
these  "modular  materials"  and  provides  an  excel- 
lent guide  to  the  design,  selection,  utilization, 
and  evaluation  of  these  study  guides.  Basically, 
the  study  guides  contain  the  objectives  of  the  unit, 
topical  outline,  reading  materials,  and  sample  prob- 
lems.    Examinations  on  various  units  are  given 
throughout  the  semester.     Though  excellence  is  re- 
quired to  pass  a  unit,  the  students  can  repeat  the 
unit  examinations  any  number  of  times  without 
penalty.     For  a  minimal  passing  grade  in  the  course, 
the  students  have  to  complete  all  the  basic  units. 
Assignment  of  higher  grades  is  a  function  of  the 
number  of  optional  units  they  pass. 

Computer  Aids  in  the  PSI  System 

From  the  initial  experience  in  teaching  this 
course  in  PSI  format,  it  was  found  that  consider- 
able time  was  spent  with  the  students,  by  both  the 
professor  and  the  teaching  assistants,  in  helping 
them  learn  the  optional  units  for  which  there  were 
no  formal  lectures.     This  increased  the  manpower 
needs  of  this  course.     It  was  felt  that  this  could 
be  reduced  to  a  great  extent  by  using  the  computer 
as  a  teaching  aid  for  self-paced  learning.  The 
computer-aided  teaching  can  help  them  learn  the 
various  linear  programming  techniques  and  their 
applications  on  their  own  time  (on/off  class 
hours),  and  at  their  own  pace. 

With the help of a teaching grant from the
Purdue Parents' Association, interactive computer
programs have been developed in such a way that
they will illustrate and test the students'
understanding of the subject matter. The students solve
problems illustrating various linear programming
algorithms in a conversational mode by answering
a series of questions. The programs are designed
so that the computer does all the time-consuming
arithmetic calculations, while the "thinking" is
done by the students. The student essentially
directs the computer to calculate whatever quantities
are needed for solving the problem by a given
algorithm. Based on these, he/she instructs the
computer to proceed step by step in the required
manner to arrive at the optimal solution to the
problem. We have provided in the program a
verification routine so that all the steps and
instructions given by the students are checked by
the computer for correctness.

The arithmetic calculations in linear programming
techniques generally consume more than
90% of the total time in solving a problem. Since
this time is eliminated by the computer, the students
are able to use their study time more effectively.
They can solve a variety of problems
to test their understanding. A series of illustrative
problems emphasizing various concepts and
practical difficulties encountered in solving
linear programs are also available to the students.
Thus a student gets a better and a more complete
exposure to the subject matter itself. At present
the interactive programs are used to reinforce
methodology and to interpret results. The unit
examinations do not utilize these interactive
programs.

Description  of  the  Interactive  Computer  Program 

Conversational  Features 

The  program  is  completely  conversational.  All 
input  and  output  is  handled  via  remote  terminals. 
The  system  always  directs  the  user  to  the  next  step 
of  the  simplex  algorithm  by  asking  a  question  re- 
lated to  some  aspect  of  the  algorithm.     At  all 
times,  the  user's  response  is  checked  for  correct- 
ness.    If  the  user  should  respond  incorrectly  to 
one  of  the  system's  queries,  an  immediate  feed- 
back will  be  sent  to  the  user  with  an  explanation 
of  why  the  response  is  incorrect  and  how  to  go 
about  finding  the  correct  answer. 

Error  Detection  for  Inappropriate  Student  Responses 

Each  student  response  to  a  system  question  is 
checked  for  correctness.     If  the  student's  answer 
is  correct,  the  next  question  is  asked  by  the  sys- 
tem.    However,  if  the  response  is  not  correct, 
feedback  explaining  the  error  is  given  at  the  ter- 
minal.    In  some  cases,  the  same  question  is  re- 
asked  and  the  student  is  given  further  opportunity 
to  respond  correctly.     In  other  instances,  the 
system  supplies  the  correct  answer  and  then  pro- 
ceeds to  the  next  question. 

Error  checking  is  accomplished  by  having  the 
system  solve  the  given  linear  program  step-by-step 
as  the  user  solves  the  same  problem.     For  instance, 
before  asking  the  user  to  supply  the  index  of  the 
pivot  row  in  the  simplex  algorithm,  the  system  will 
have  determined  that  information.     Then  the  user 
will  be  asked  to  supply  the  same.     After  the  user 
enters  the  response,  it  is  a  simple  matter  to 
check  the  response  against  the  result  already  de- 
termined by  the  system.     If  the  user's  response  is 
correct,  the  next  question  is  printed  on  the  ter- 
minal.    If  the  response  is  incorrect,  a  message  to 
that  effect  is  printed  with  a  brief  explanation  of 


why  the  response  is  not  correct. 
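The checking scheme described above (the system works out the answer first, then compares the user's reply with it) can be sketched in a few lines. This is a minimal illustration in Python, not the FORTRAN program the authors describe; the function names `pivot_row` and `check_answer` and the wording of the feedback strings are assumptions made for the example.

```python
from fractions import Fraction


def pivot_row(rows, rhs, pivot_col):
    """Return the 1-based pivot row by the minimum ratio rule, or None."""
    best_row, best_ratio = None, None
    for i, (row, b) in enumerate(zip(rows, rhs), start=1):
        entry = row[pivot_col - 1]
        if entry > 0:  # ratios are formed only for positive entries
            ratio = Fraction(b, entry)
            if best_ratio is None or ratio < best_ratio:
                best_row, best_ratio = i, ratio
    return best_row


def check_answer(expected, response):
    """Compare the user's reply with the value determined beforehand."""
    if response == expected:
        return True, "CORRECT"
    return False, ("INCORRECT RESPONSE - THE ROW WITH THE "
                   "SMALLEST RATIO IS THE PIVOT ROW")


# Constraint rows and right-hand sides of the Appendix example.
rows = [[1, 1, 1], [10, 4, 5], [2, 2, 6]]
rhs = [100, 600, 300]

expected = pivot_row(rows, rhs, pivot_col=1)  # determined before asking
ok, message = check_answer(expected, response=3)
print(expected, ok, message)
```

The system's own answer is computed before the question is put, so checking the reply reduces to a single comparison.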

For instance, on line A (pg. 5) of the enclosed
example in the Appendix, the student is asked, "IS
THE PROBLEM IN STANDARD FORM?". In this case, the
problem is not in standard form. But the student
has responded, "YES". Thus, the feedback given is
"INCORRECT RESPONSE - THE PROBLEM IS NOT IN STANDARD
FORM, NOT ALL OF THE CONSTRAINTS ARE EQUALITIES".
Then the next question is generated.

User  Assistance 

In addition to the brief explanations provided
when the user gives an incorrect response,
another form of explanatory assistance is offered
by the system. At several points, the student is
asked to name the next step of the algorithm. The
possible responses and their meanings are:

RELPROFIT - print the relative profit vector

RATIOS    - calculate ratios of right-hand sides
            divided by the corresponding entry of
            the pivot column

PIVOT     - perform a pivot operation

In any instance where one of the above responses is
the correct one, the user may type "HELP" if he is
uncertain of which choice is the correct one. After
"HELP" is typed, a message indicating which is
the correct choice and why it is correct will be
provided. A possible extension would be to allow
the user to ask for "HELP" at any point during the
execution of the program. This should be included
in the next version of the system.


An  example  of  the  use  of  this  command  is  given 
in  line  B  (pg.  5)  of  the  accompanying  example.  The 

question  asked  is,   "WHAT  IS  THE  NEXT  STEP?".  In 
this  case,  the  correct  response  would  be  "RELPROF- 
IT", since  the  user  must  look  at  the  relative  prof- 
it vector.     The  user  has  typed  "HELP"  and  thus  re- 
ceives an  appropriate  explanation. 


Implementation  Description 

Programming  Language  Requirements 

The system was programmed in FORTRAN IV on
the CDC 6500. A language with better character
string handling facilities would have been a better
choice. However, the constraining factor was the
interactive field length restriction on the Purdue
system. Interactive jobs are limited to a maximum
field length of 21K. The object code generated by
FORTRAN was able to fit into this space and still
allow some room for extensions. On an IBM system
with time-sharing, APL would be a better choice.


Algorithmic  Description 


At present three mathematical programming
algorithms have been programmed in the interactive
system. These include a primal simplex and a dual
simplex algorithm to solve linear programs, and a
branch and bound algorithm for solving integer
linear programming problems. Each program's logic
is based on the description of that algorithm given
in the text by Phillips, Ravindran, and Solberg
[6]. Students are asked to input problems of their
own choosing. This allows for great flexibility
in the instructional problems chosen rather than
limiting the selection to just a few. The instructor
may also suggest some problems which demonstrate
the basics of the algorithm. After mastering
the basics, the students can experiment with
problems of their own choosing.

Concluding  Remarks 

The  PSI  system  is  currently  being  well  re- 
ceived by  the  students.     The  authors  have  evalu- 
ated the  courses  by  several  measures:  comparison 
with  test  results  in  other  years,  student  election 
of  more  advanced  courses  in  these  areas,  student 
acceptance  as  indicated  on  written  course  cri- 
tiques, etc.     In  all  cases,  the  self-paced  method 
of  instruction  was  judged  in  general  to  be  superior 
for  this  particular  course.     (See  Table  1  for  stu- 
dent comments.)    This  is  not  to  say  that  the  usual 
problems  of  some  students  falling  behind  schedule 
did  not  exist,  because  they  did.     Effort  is  ex- 
pended during  the  semester  to  help  student  motiva- 
tion.    Also,  the  PSI  system  consumes  more  faculty 
time  initially  in  the  planning  of  the  units,  and 
preparation  of  study  guides  and  tests. 

Considerable  time  is  still  being  spent  on  an 
individual  basis  with  some  students  in  helping  them 
learn  the  optional  units  for  which  there  are  no 
formal  lectures.     To  offset  this  the  author  has  ob- 
tained another  research  grant  to  support  the  prep- 
aration of  video  cassette  tapes  for  the  optional 
units.     The  School  of  Industrial  Engineering  has 
recently  acquired  a  TV  monitor  and  a  cassette 
playback  unit.     These  video  tapes  can  be  played 
on  these  units  by  the  students  at  their  own  time. 
We  hope  to  have  these  tapes  prepared  by  the  end  of 
the  year  for  students'  use. 
APPENDIX

The solution to the following linear program
by the interactive simplex algorithm is shown as
an illustration:

Maximize      z = 10x_1 + 6x_2 + 4x_3

Subject to      x_1 +  x_2 +  x_3 <= 100

              10x_1 + 4x_2 + 5x_3 <= 600

               2x_1 + 2x_2 + 6x_3 <= 300

               x_1, x_2, x_3 >= 0
LINEAR PROGRAMMING SELF-PACED INSTRUCTION


ENTER  THE  NUMBER  OF  CONSTRAINTS  FOLLOWED  BY 
A  BLANK  FOLLOWED  BY  THE  NUMBER  OF  VARIABLES 

:  3  3

ENTER  THE  LINEAR  PROGRAM 

IS  THE  OBJECTIVE  FUNCTION  TO  BE  MINIMIZED  OR 
MAXIMIZED? 

:  MAX 

ENTER  THE  C  VECTOR 

:  10  6  4

ENTER  THE  A  MATRIX,  ROWWISE 

:   1  1  1 
:  10  4  5
:  2  2  6 

ENTER  THE  B  VECTOR 
:  100  600  300 

ENTER  THE  CONSTRAINT  TYPES: 

<  FOR  LESS  THAN  OR  EQUAL  TO, 
>  FOR  GREATER  THAN  OR  EQUAL  TO, 
=  FOR  EQUALITY  CONSTRAINTS 

:  <  <  < 

(A)     IS  THE  PROBLEM  IN  STANDARD  FORM? 
:  YES 

INCORRECT  RESPONSE  - 

THE  PROBLEM  IS  NOT  IN  STANDARD  FORM 


NOT  ALL  OF  THE  CONSTRAINTS  ARE  EQUALITIES 


CHECK  THE  CONSTRAINTS  TO  SEE  IF  THERE  ARE  ANY 
NEGATIVE  RIGHT-HAND  SIDES.     ALL  SUCH  CON- 
STRAINTS NEED  TO  BE  MULTIPLIED  BY  -1. 
WHICH  CONSTRAINTS  NEED  TO  BE  MULTIPLIED  BY 
-1? 

TYPE  0  (zero)  if  none 
:  0 

FOR EACH CONSTRAINT, TYPE A +1, -1, OR 0
DEPENDING ON WHETHER A SLACK VARIABLE IS
NEEDED, A SURPLUS VARIABLE IS NEEDED, OR
NEITHER ONE IS NEEDED

:  1  1  1 

THE  SLACK  AND  SURPLUS  VARIABLES  HAVE  BEEN 
INSERTED 

ARE  ARTIFICIAL  VARIABLES  NEEDED? 
:  YES 


INCORRECT  RESPONSE  - 

NO  ARTIFICIAL  VARIABLES  ARE  NEEDED  BECAUSE 
THERE  EXISTS  AN  INITIAL  FEASIBLE  BASIS 


THE CURRENT TABLEAU IS:

COST        10      6      4      0      0      0

            X1     X2     X3     X4     X5     X6        B

BASIS

X4        1.00   1.00   1.00   1.00    .00    .00   100.00

X5       10.00   4.00   5.00    .00   1.00    .00   600.00

X6        2.00   2.00   6.00    .00    .00   1.00   300.00

WHAT  ARE  THE  VALUES  OF  THE  BASIC  VARIABLES 
IN  THE  ORDER  IN  WHICH  THEY  APPEAR  IN  THE 
TABLEAU 

:  100  600  300 

WHAT  IS  C(B),  THE  VECTOR  OF  COST  COEFFICIENTS 
OF  THE  BASIC  VARIABLES  IN  CORRESPONDING  ORDER? 

:  0  0  0 

THE  VALUE  OF  THE  OBJECTIVE  FUNCTION  =  .00 

(B)  WHAT IS THE NEXT STEP?

CHOOSE  FROM  PIVOT,  RATIOS,  OR  RELPROFIT 
IF  YOU  DONT  KNOW,  TYPE  HELP 

:  HELP 


THE  NEXT  STEP  OF  THE  SIMPLEX  ALGORITHM    IS  TO 
LOOK  AT  THE  RELATIVE  PROFIT  VECTOR  AND 
DETERMINE  WHETHER  THE  CURRENT  BASIC  FEASIBLE 
SOLUTION  IS  OPTIMAL.     DETERMINE  THE  PIVOT 
COLUMN  AND  HENCE  THE  VARIABLE  WHICH  ENTERS 
THE  BASIS  BY  CHOOSING  THE  VARIABLE  WITH  THE 
LARGEST  RELATIVE  PROFIT  COEFFICIENT.  IN 
ORDER  TO  SEE  THE  RELATIVE  PROFIT  VECTOR,  TYPE 
RELPROFIT. 


WHAT  IS  THE  NEXT  STEP? 

CHOOSE  FROM  PIVOT,  RATIOS,  OR  RELPROFIT 
IF  YOU  DONT  KNOW,  TYPE  HELP 

:  RELPROFIT 
THE  RELATIVE  PROFIT  VECTOR  IS: 
10.00    6.00    4.00     .00     .00     .00
IS  THE  TABLEAU  OPTIMAL? 

:  YES 

INCORRECT  RESPONSE  - 

FOR A MAXIMIZATION PROBLEM, THE TABLEAU IS
OPTIMAL WHEN ALL THE RELATIVE PROFIT
COEFFICIENTS ARE NON-POSITIVE

SELECT  THE  PIVOT  COLUMN 

:  1 

WHAT  IS  THE  NEXT  STEP? 

CHOOSE  FROM  PIVOT,  RATIOS,  OR  RELPROFIT 
IF  YOU  DONT  KNOW,  TYPE  HELP 
:  HELP

THE NEXT STEP OF THE SIMPLEX ALGORITHM
IS TO DETERMINE THE PIVOT ROW AND HENCE
THE VARIABLE WHICH LEAVES THE BASIS. USE THE
MINIMUM RATIO RULE TO DETERMINE THE BASIC
VARIABLE TO LEAVE THE BASIS. TO CALCULATE
RATIOS TYPE RATIOS


:  RATIOS 

FOR  EACH  ROW,  TYPE  A  1  IF  A  RATIO  NEEDS  TO 
BE  CALCULATED  OTHERWISE  TYPE  A  0 

:  0  0  0 

THE  RESPONSE  FOR  ROW  1  IS  INCORRECT 


THE  RESPONSE  FOR  ROW  2  IS  INCORRECT 


THE  RESPONSE  FOR  ROW  3  IS  INCORRECT 

THE CORRECT RESPONSE IS:  1  1  1

THE  RATIOS  ARE 

ROW  NO.  RATIO 

1  100.00 

2  60.00 

3  150.00 

CAN  A  PIVOT  OPERATION  BE  PERFORMED? 
:  NO 

INCORRECT  RESPONSE  - 

A  PIVOT  OPERATION  CAN  BE  PERFORMED  SINCE  AT 
LEAST  ONE  RATIO  HAS  BEEN  CALCULATED 


SELECT  THE  PIVOT  ROW 
:  3 

INCORRECT  RESPONSE  - 

THE  ROW  WITH  THE  SMALLEST  RATIO  IS  THE  PIVOT 
ROW 

SELECT THE PIVOT ROW
:  2

WHAT IS THE NEXT STEP?

CHOOSE FROM PIVOT, RATIOS, OR RELPROFIT
IF YOU DONT KNOW, TYPE HELP

:  HELP


NOW  THAT  THE  PIVOT  ELEMENT  HAS  BEEN 
DETERMINED,  A  PIVOT  OPERATION  CAN  BE 
PERFORMED.     TYPE  PIVOT  TO  PERFORM  THE 
PIVOT  OPERATION 


WHAT  IS  THE  NEXT  STEP? 

CHOOSE  FROM  PIVOT,  RATIOS,  OR  RELPROFIT 
IF  YOU  DONT  KNOW,  TYPE  HELP 


:  PIVOT 


THE CURRENT TABLEAU IS:

COST        10      6      4      0      0      0

            X1     X2     X3     X4     X5     X6        B

BASIS

X4         .00    .60    .50   1.00   -.10    .00    40.00

X1        1.00    .40    .50    .00    .10    .00    60.00

X6         .00   1.20   5.00    .00   -.20   1.00   180.00

WHAT  ARE  THE  VALUES  OF  THE  BASIC  VARIABLES 
IN  THE  ORDER  IN  WHICH  THEY  APPEAR  IN  THE 
TABLEAU 

:  40  60  180

WHAT  IS  C(B),  THE  VECTOR  OF  COST  COEFFICIENTS 
OF  THE  BASIC  VARIABLES  IN  CORRESPONDING  ORDER? 

:  0  10  0 

THE VALUE OF THE OBJECTIVE FUNCTION = 600.00
WHAT  IS  THE  NEXT  STEP? 

CHOOSE  FROM  PIVOT,  RATIOS,  OR  RELPROFIT 
IF  YOU  DONT  KNOW,  TYPE  HELP 

:  RELPROFIT 

THE  RELATIVE  PROFIT  VECTOR  IS: 

.00      2.00      -1.00       .00       -1.00  .00 
IS  THE  TABLEAU  OPTIMAL? 
:  NO 

SELECT  THE  PIVOT  COLUMN 
:  2 

WHAT  IS  THE  NEXT  STEP? 

CHOOSE  FROM  PIVOT,  RATIOS,  OR  RELPROFIT 
IF  YOU  DONT  KNOW,  TYPE  HELP 

:  RATIOS 

FOR EACH ROW, TYPE A 1 IF A RATIO NEEDS TO BE
CALCULATED OTHERWISE TYPE A 0

:  1  1  1 

THE  RATIOS  ARE 
ROW  NO.  RATIO 

1  66.67 

2  150.00 

3  150.00

CAN  A  PIVOT  OPERATION  BE  PERFORMED? 
:  YES 

SELECT  THE  PIVOT  ROW 
:  1 

WHAT  IS  THE  NEXT  STEP? 

CHOOSE  FROM  PIVOT,  RATIOS,  OR  RELPROFIT 
IF  YOU  DONT  KNOW,  TYPE  HELP 
:  PIVOT

THE CURRENT TABLEAU IS:

COST        10      6      4      0      0      0

            X1     X2     X3     X4     X5     X6        B

BASIS

X2         .00   1.00    .83   1.67   -.17    .00    66.67

X1        1.00    .00    .17   -.67    .17    .00    33.33

X6         .00    .00   4.00  -2.00   -.00   1.00   100.00

WHAT  ARE  THE  VALUES  OF  THE  BASIC  VARIABLES 
IN  THE  ORDER  IN  WHICH  THEY  APPEAR  IN  THE 
TABLEAU 

:  66.67    33.33  100.00

WHAT  IS  C(B),  THE  VECTOR  OF  COST  COEFFICIENTS 
OF  THE  BASIC  VARIABLES  IN  CORRESPONDING  ORDER? 

:  6    10  0 

THE VALUE OF THE OBJECTIVE FUNCTION = 733.33
WHAT  IS  THE  NEXT  STEP? 

CHOOSE  FROM  PIVOT,  RATIOS,  OR  RELPROFIT 
IF  YOU  DONT  KNOW,  TYPE  HELP 

:  RELPROFIT 

THE  RELATIVE  PROFIT  VECTOR  IS: 

.00      .00      -2.67      -3.33      -.67      .00

IS  THE  TABLEAU  OPTIMAL? 

:  YES 


DO  YOU  WANT  TO  BEGIN  A  NEW  PROBLEM? 

:  NO 

+++LOG 

TOB  L02i+  ll,l+U.5l.  10/07/75. 
ESTIMATED  SESSION  COST  $  .17 
PLEASE  TURN  OFF  TERMINAL.  TNX. 
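The arithmetic of the session above can be reproduced offline. What follows is a minimal sketch in Python, not the authors' FORTRAN program: the function name `simplex_max` is hypothetical, the routine handles only maximization problems with "<=" constraints and non-negative right-hand sides, and it uses exact fractions so that the tableau values can be compared with the printed ones without rounding noise.

```python
from fractions import Fraction as F


def simplex_max(c, A, b):
    """Tableau simplex for: max c.x  s.t.  A x <= b, x >= 0, with b >= 0."""
    m, n = len(A), len(c)
    # One slack column per constraint; the slacks form the initial basis.
    T = [[F(v) for v in row] + [F(int(i == j)) for j in range(m)] + [F(bi)]
         for i, (row, bi) in enumerate(zip(A, b))]
    cost = [F(v) for v in c] + [F(0)] * m
    basis = list(range(n, n + m))
    history = []  # objective value at each tableau
    while True:
        # Relative profit of column j: c_j minus c_B times column j.
        rel = [cost[j] - sum(cost[basis[i]] * T[i][j] for i in range(m))
               for j in range(n + m)]
        z = sum(cost[basis[i]] * T[i][-1] for i in range(m))
        history.append(z)
        col = max(range(n + m), key=lambda j: rel[j])
        if rel[col] <= 0:  # all relative profits non-positive: optimal
            return basis, [T[i][-1] for i in range(m)], z, history
        # Minimum ratio rule over rows with a positive pivot-column entry.
        ratios = [(T[i][-1] / T[i][col], i) for i in range(m) if T[i][col] > 0]
        if not ratios:
            raise ValueError("problem is unbounded")
        _, r = min(ratios)
        pivot = T[r][col]
        T[r] = [v / pivot for v in T[r]]
        for i in range(m):
            if i != r:
                factor = T[i][col]
                T[i] = [vi - factor * vr for vi, vr in zip(T[i], T[r])]
        basis[r] = col


basis, values, z, history = simplex_max(
    [10, 6, 4], [[1, 1, 1], [10, 4, 5], [2, 2, 6]], [100, 600, 300])
print([f"X{j + 1}" for j in basis], [float(v) for v in values], float(z))
```

Run on the Appendix problem, the sketch visits objective values 0, 600, and 2200/3 (733.33 to two decimals) and stops with basis X2, X1, X6, in agreement with the three tableaux of the session.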
Table 1. Results of Student Survey

(Entries that did not survive reproduction are shown as "?".)

No. of Students Enrolled:   49 (only one entry legible)
No. of Students Responded:  39 (only one entry legible)

                                          Summer 1974      Summer 1975      Summer 1976
Reaction to the PSI System               Yes No Neither   Yes No Neither   Yes No Neither

a. Favorable initial response
   (semester beginning)                    ?  ?    ?        ?  ?    ?        ?  ?    ?
b. Favorable final response
   (semester end)                         18  ?    ?        ?  ?    ?        ?  ?    ?
c. Helped me to achieve desired goals      ?  ?    ?        ?  1    0       30  8    1
d. Able to plan my studies better          ?  ?    ?        ?  5    0       29  9    1
e. Helped me to get a better grade         ?  ?    ?        ?  7    7       27 10    2
f. Would have preferred traditional
   format                                  5 18    0        1 32    3        6 30    3
g. Learned more under PSI                 10  7    6       24  5    7       24 11    ?
h. Worked harder under PSI                 8 13    2       18 15    3       14 21    4
i. Worked less under PSI                  12  9    2       11 21    4       14 24    1
IMPORTANCE OF MODELLING
FOR INTERPRETATION OF
LINEAR PROGRAMMING PROBLEMS


Leonard  W.  Swanson 
Northwestern  University 


Abstract 

The construction of the mathematical model
for a linear programming problem requires extreme
care in order that it be effective. If one uses
as few variables as possible, he may find that it
is efficient in computation time but ineffective
for sensitivity analysis. This paper uses a
relatively uncomplicated example to show the way in
which proper modelling enables one to extract
important analyses not obtainable through a simpler
model. The approach has been successfully used
in teaching linear programming, particularly for
explaining the concept of duality. The same
methods can be highly effective in managerial
applications.

Introduction 

It  is  often  thought  that  one  is  being  very 
efficient  if  he  solves  a  Linear  Programming  Prob- 
lem by  using  the  fewest  possible  number  of  vari- 
ables.    If  one  is  interested  only  in  the  solution 


and  not  in  any  kind  of  sensitivity  analysis,  this 
may  be  true,  but  if  one  is  interested  in  the 
effect  of  a  variation  of  the  parameters  of  the 
problem,  many  more  details  may  be  required  to  en- 
hance the  analysis  and  interpretation. 

Scheduling  Problem 

In  order  to  consider  the  benefits  of  effi- 
cient modelling,  consider  the  production  schedul- 
ing problem  shown  in  Figure  1  (no  claim  is  made 
for  the  originality  of  the  example). 

A plant makes two products, A and B, which
are routed through four processing centers 1, 2,
3, and 4 as shown by the solid lines in the
enclosed diagram. If there is spare capacity in
center 3, it is possible to route product A through
3 (dotted line) instead of going through 2 twice,
but this is more expensive.

Given the information in Table 1 and Table 2,
how should production be scheduled so as to
maximize profit? (By production schedule is meant
the specification of the following three amounts:

(1) The daily amount of raw material used
for A, regular route,

(2) The daily amount of raw material used
for A, optional route,

(3) The daily amount of raw material used
for B.

Assume that sufficient storage capacity is
available at no additional cost.)

[Figure 1. Production scheduling: routing of products A and B through Centers 1, 2, 3, and 4, with the alternate route for A through Center 3 shown dotted.]


Table 1

                             Inputs,        Recovery,   Running cost
Product   Center             gals. per hr.  %           per hr., $

A         1                  300            90          150
          2 (1st pass)       450            95          200
          4                  250            85          180
          2 (2nd pass)       400            80          220
          3 (alternate)      350            75          250

B         1                  500            90          300
          3                  480            85          250
          4                  400            80          240

Table 2

           Raw material    Sales price     Maximum daily sales,
           cost            per finished    gal. of
Product    per gal., $     gal., $         finished product

A          5               20              1700
B          6               18              1500

Centers  1  and  4  run  up  to  16  hours  a  day; 
centers  2  and  3  run  up  to  12  hours  a  day.     A  final 
restriction  is  furnished  by  shipping  facilities, 
which  limit  the  daily  output  of  A  and  B  to  a  total 
of  2500  gallons. 

Consider first the formulation of the problem
in a form in which as few variables as possible
are used. Therefore

Let x_1 = the number of gallons of input of
          A, which is to follow the regular
          processing route

    x_2 = the number of gallons of input of
          A, which is to follow the alternate
          processing route

    x_3 = the number of gallons of input of
          B, which has only one route.

Figure 2 is a representation of the problem
incorporating these variables and also the
information from Tables 1 and 2.

On each path through a particular center,
there are two numbers; the left hand number is the
number of gallons per hour that can be processed
and the right hand number is the cost per hour for
processing.

[Figure 2. Flow diagram for the three-variable model. Each path through a center carries its processing rate (gal. per hr.) and hourly cost from Table 1. Successive flows are .9x_1, .855x_1, .72675x_1, .5814x_1 on the regular route for A; .9x_2, .855x_2, .72675x_2, .5450625x_2 on the alternate route; and .9x_3, .765x_3, .612x_3 for B.]

Mathematical Statement of the Problem

Find x_1, x_2, x_3 >= 0

such that

Center capacities:

    5x_1 + 5x_2 + 3x_3              <=  24,000
    13.74075x_1 + 7.2x_2            <=  43,200
    34.884x_2 + 31.5x_3             <= 201,600
    6.84x_1 + 6.84x_2 + 3.825x_3    <=  32,000

Sales limits:

    .5814x_1 + .5450625x_2          <= 1700
    .612x_3                         <= 1500

Dock capacity:

    .5814x_1 + .5450625x_2 + .612x_3 <= 2500

and such that

    4.7126875x_1 + 3.866543x_2 + 3.48825x_3

is a maximum.
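The yields and objective coefficients of this compact model follow from Tables 1 and 2 by walking each route center by center. The sketch below is one reading of that computation in Python; the helper `unit_profit` and the route lists are names introduced here, and it assumes that processing cost at a center is (gallons entering / rate) times the hourly cost.

```python
# (rate in gal. per hr., recovery fraction, running cost in $ per hr.), Table 1
A_COMMON = [(300, 0.90, 150), (450, 0.95, 200), (250, 0.85, 180)]
A_REGULAR = A_COMMON + [(400, 0.80, 220)]    # second pass through center 2
A_ALTERNATE = A_COMMON + [(350, 0.75, 250)]  # center 3 instead
B_ROUTE = [(500, 0.90, 300), (480, 0.85, 250), (400, 0.80, 240)]


def unit_profit(route, raw_cost, price):
    """Profit and finished yield per gallon of raw input along a route."""
    flow, processing = 1.0, 0.0
    for rate, recovery, hourly_cost in route:
        processing += flow / rate * hourly_cost  # hours used times $ per hr.
        flow *= recovery                         # gallons leaving the center
    return price * flow - raw_cost - processing, flow


c1, y1 = unit_profit(A_REGULAR, raw_cost=5, price=20)
c2, y2 = unit_profit(A_ALTERNATE, raw_cost=5, price=20)
c3, y3 = unit_profit(B_ROUTE, raw_cost=6, price=18)
print(round(y1, 7), round(y2, 7), round(y3, 7))
print(round(c1, 7), round(c2, 6), round(c3, 5))
```

Under these assumptions the yields come out as .5814, .5450625, and .612, and the per-gallon profits as 4.7126875, 3.866543, and 3.48825 for x_1, x_2, and x_3 respectively.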

Optimal  Solution 

In  this  form  of  the  problem,  one  can  get  the 
optimum  solution  for  carrying  out  the  production 
schedule.     The  solution  reveals  that 

1.  2923.98  gallons  follow  the  regular  route 
A  but  none  follow  the  alternate  route. 

2.  1307.19  gallons  follow  route  B. 

3.  The profit is $18,339.59.

4.  The  actual  output  of  either  A  or  B  is  not 
revealed  directly  although  it  can  be  ob- 
tained by  additional  simple  computation. 

5.  Through  the  dual  solutions  one  sees  that 
the  sales  restriction  on  A  is  a  limita- 
tion on  profit  and  that  Dock  Capacity  is 
also  a  limitation  on  profit.     The  per 
unit  gains  made  by  relaxing  these  con- 
straints are  given  by  the  dual  solutions. 

6.  Nothing in the solution is directly
revealed concerning the effect of changing
either the rates of gallon throughput
for each of the centers or the
hourly costs of processing in each of
the centers.


7.     A  study  of  the  effect  of  changing  the  co- 
efficients of  the  objective  function  is 
not  very  revealing  since  the  coefficients 
in  this  function  are  a  conglomeration  of 
a  number  of  incomes  and  costs. 
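The "additional simple computation" mentioned in item 4 can be sketched as follows. This Python fragment assumes, following item 5, that the sales limit on A and the dock capacity are the binding constraints, and it uses the x_1 profit coefficient 4.7126875 obtained by recomputing from Tables 1 and 2; the variable names are illustrative only.

```python
# Finished gallons per gallon of raw input on each route.
YIELD_A_REG, YIELD_A_ALT, YIELD_B = 0.5814, 0.5450625, 0.612

x1 = 1700 / YIELD_A_REG        # sales limit on A is binding
x2 = 0.0                       # alternate route unused
x3 = (2500 - 1700) / YIELD_B   # dock capacity is binding

output_a = YIELD_A_REG * x1 + YIELD_A_ALT * x2
output_b = YIELD_B * x3

# Hours used in each center (capacity rows divided by their scale factors).
hours = {
    1: (5 * x1 + 5 * x2 + 3 * x3) / 1500,
    2: (13.74075 * x1 + 7.2 * x2) / 3600,
    3: (34.884 * x2 + 31.5 * x3) / 16800,
    4: (6.84 * x1 + 6.84 * x2 + 3.825 * x3) / 2000,
}
profit = 4.7126875 * x1 + 3.866543 * x2 + 3.48825 * x3

print(round(x1, 2), round(x3, 2), round(output_a, 1), round(output_b, 1))
print({k: round(v, 2) for k, v in hours.items()}, round(profit, 2))
```

The inputs evaluate to 2923.98 and 1307.19 gallons, the finished outputs to 1700 gallons of A and 800 gallons of B, and no center reaches its daily hour limit, which is consistent with only the sales and dock constraints carrying nonzero dual values.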

This form of the model seems to be relatively
neat and one might even pride himself in the
brevity of the format, thinking that restricting the
problem to three variables and seven constraints
is a minor triumph.

It  would  appear  that  a  manager  interested  in 
producing  the  most  effective  schedule  would  require 
a  sensitivity  analysis  which  would  study  the  effects 
of  changes  in  processing  rates  and  processing  costs 
for  each  of  the  centers.     Accordingly,   in  what  fol- 
lows I  develop  a  model  of  the  same  problem  which 
will  enable  the  manager  to  learn  a  vast  amount  more 
about  the  operation  and  which  would  enable  him  to 
do  a  better  job  of  effective  management. 

Alternate  Statement  of  the  Problem 

As shown in Figure 3, x_1 and x_9 are used to
represent the raw material inputs for A and B
respectively. These variables are modified by
recovery rates and new definitions are made so that
variables x_2 to x_12 designate various inputs and
outputs in the processing stream. From Figure 3
it can be seen that a solution for x_7 gives the
final output of A, regular; a solution for x_8 gives
the final output of A, alternate; and a solution
for x_12 gives the final output of B. Furthermore,
the definitions of these variables serve as
constraints which involve the recovery rates. The
definitions of variables x_13 through x_20 serve as
constraints which define the hourly processing rates
of the centers. The recovery and hourly processing
definitions and corresponding constraints are given
below.


[Figure 3. Flow diagram for the expanded model: raw inputs x_1 (A) and x_9 (B), intermediate flows x_2 through x_12, and processing hours x_13 through x_20 at each center with the hourly costs of Table 1. Outputs are A (regular), A (alternate), and B.]
Recovery definitions:          When written as constraints:

    x_2  = .90x_1                  x_2  - .90x_1  = 0
    x_3  = .95x_2                  x_3  - .95x_2  = 0
    x_4  = .85x_3                  x_4  - .85x_3  = 0
    x_7  = .80x_5                  x_7  - .80x_5  = 0
    x_8  = .75x_6                  x_8  - .75x_6  = 0
    x_10 = .90x_9                  x_10 - .90x_9  = 0
    x_11 = .85x_10                 x_11 - .85x_10 = 0
    x_12 = .80x_11                 x_12 - .80x_11 = 0

Center processing times:       When written as constraints:

    x_13 = x_1/300                 x_1  - 300x_13 = 0
    x_14 = x_2/450                 x_2  - 450x_14 = 0
    x_15 = x_3/250                 x_3  - 250x_15 = 0
    x_16 = x_5/400                 x_5  - 400x_16 = 0
    x_17 = x_6/350                 x_6  - 350x_17 = 0
    x_18 = x_9/500                 x_9  - 500x_18 = 0
    x_19 = x_10/480                x_10 - 480x_19 = 0
    x_20 = x_11/400                x_11 - 400x_20 = 0

We add the constraint which shows the balance
for A regular and A alternate route as

    x_4 = x_5 + x_6

If we then add the seven constraints of our
smaller model in terms of the above variables
we have a model with twenty-four constraints
and twenty variables. The problem is then
stated as follows:


Find x_1 to x_20 >= 0

such that

Recovery definitions:

    x_2  - .90x_1  = 0
    x_3  - .95x_2  = 0
    x_4  - .85x_3  = 0
    x_7  - .80x_5  = 0
    x_8  - .75x_6  = 0
    x_10 - .90x_9  = 0
    x_11 - .85x_10 = 0
    x_12 - .80x_11 = 0

Hourly processing rates:

    x_1  - 300x_13 = 0
    x_2  - 450x_14 = 0
    x_3  - 250x_15 = 0
    x_5  - 400x_16 = 0
    x_6  - 350x_17 = 0
    x_9  - 500x_18 = 0
    x_10 - 480x_19 = 0
    x_11 - 400x_20 = 0

Balance:

    x_4 - x_5 - x_6 = 0

Center capacities:

    x_13 + x_18 <= 16
    x_14 + x_16 <= 12
    x_17 + x_19 <= 12
    x_15 + x_20 <= 16

Sales constraints:

    x_7 + x_8 <= 1700
    x_12      <= 1500

Dock capacity:

    x_7 + x_8 + x_12 <= 2500

and such that

    -5x_1 + 20x_7 + 20x_8 - 6x_9 + 18x_12 - 150x_13 - 200x_14
    - 180x_15 - 220x_16 - 250x_17 - 300x_18 - 250x_19 - 240x_20

is a MAXIMUM.
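That the expanded model describes the same operation as the three-variable one can be checked by pushing the compact solution through the definitions. The Python sketch below does this under stated assumptions: the raw inputs are taken as 1700/.5814 gallons for A and 800/.612 gallons for B, all of A follows the regular route, and the compact profit coefficients are those recomputed from Tables 1 and 2. The dictionary `x`, indexed 1 to 20, is an illustrative device, not part of the paper.

```python
x = {}
x[1] = 1700 / 0.5814           # raw material A
x[2] = 0.90 * x[1]             # out of center 1
x[3] = 0.95 * x[2]             # out of center 2, first pass
x[4] = 0.85 * x[3]             # out of center 4
x[5], x[6] = x[4], 0.0         # balance: all of A takes the regular route
x[7] = 0.80 * x[5]             # final A, regular
x[8] = 0.75 * x[6]             # final A, alternate
x[9] = 800 / 0.612             # raw material B
x[10] = 0.90 * x[9]
x[11] = 0.85 * x[10]
x[12] = 0.80 * x[11]           # final B
# Processing hours: input gallons divided by the hourly rate.
for hours_var, (flow_var, rate) in zip(
        range(13, 21),
        [(1, 300), (2, 450), (3, 250), (5, 400),
         (6, 350), (9, 500), (10, 480), (11, 400)]):
    x[hours_var] = x[flow_var] / rate

hourly_cost = {13: 150, 14: 200, 15: 180, 16: 220,
               17: 250, 18: 300, 19: 250, 20: 240}
profit = (-5 * x[1] + 20 * x[7] + 20 * x[8] - 6 * x[9] + 18 * x[12]
          - sum(cost * x[j] for j, cost in hourly_cost.items()))
compact_profit = 4.7126875 * x[1] + 3.48825 * x[9]

print(round(x[7], 1), round(x[12], 1), round(profit, 2))
```

Both objective functions give the same value at this point, and the finished outputs x_7 and x_12 appear directly as variables, which is the interpretive advantage the expanded model is meant to provide.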
Scheduling Parameters

The actual parameters for this problem in either
form are:

1.  Sales Limits on A and B

2.  Dock Capacity

3.  Hourly Capacities for centers 1, 2, 3, and 4

4.  Cost of Raw Material Input A and B

5.  Income from Sales of A and B

6.  Center 1
    Rate of processing (Gals/hour) and
    cost of processing ($/hour) for A
    Same as above for B

7.  Center 2
    Rate of processing (Gals/hour) and
    cost of processing ($/hour) for A, 1st pass
    Same as above for A, 2nd pass

8.  Center 3
    Rate of processing (Gals/hour) and
    cost of processing ($/hour) for B
    Same as above for A alternate route

9.  Center 4
    Rate of processing (Gals/hour) and
    cost of processing ($/hour) for A
    Same as above for B

Advantages of the Second Model

When using the short form of the problem, we
are able to make a sensitivity study for items 1,
2, and 3 only. In the larger model, we are able
to analyze all items on the list [1].

Sensitivity on Recovery Rates

In order to see the advantages of the Second
Model, consider the effect of changing the recovery
rate for material A going through center 1. The
recovery rate for this process is initially 90%.
It is rather obvious that increasing the recovery
rate will improve the profit if nothing else is
changed, but what are the effects quantitatively?

Consider the definition equation

    x_2 = .90 x_1

and its corresponding constraint

    x_2 - .90 x_1 = 0

The dual solution corresponding to this constraint
is

    + 6.11111

This is interpreted to mean that if the constraint
were written

    x_2 - .90 x_1 = b_1

the instantaneous rate of change in profit is
6.11111 for a unit increase in b_1.

Since the recovery rate is the coefficient of
x_1, we are really trying to do a sensitivity study
involving a non-linearity, but good approximations
can be made. Since

    x_2 = .9 x_1 + b_1

it can be seen that increasing b_1 makes it possible
to get the same output x_2 with less input x_1,
i.e. an increase in recovery rate; b_1 is measured
in gallons and the rate of increase in profit is
in dollars per gallon.

If b_1 is increased one gallon, it will result
in x_1 being reduced by b_1/.9 or 1.1111 b_1,
providing x_2 and the recovery rate remain fixed.
This can be seen since

    x_2 = .9 (x_1 - Δ) + b_1 = .9 x_1

    .9 x_1 - .9 Δ + b_1 = .9 x_1

    Change in x_1 = Δ = b_1/.9 = 1.111 b_1

Therefore the savings per gallon of b_1 will be
approximately

    1.1111 (5 + 150/300) = 1.1111 (5.5) = 6.1111

where the $5 is the per unit savings in raw material
input and 150/300 is the per unit savings in
processing costs, which checks with the dual solution.

In order to translate this to a change of 1%
in the recovery rate, a good approximation for the
effects, for small values at least, is given by
changing b_1 by an amount equal to 1% of x_1, since

    x_2 = .91 x_1

leads to having

    x_2 = .9 x_1 + .01 x_1

Using the constraint

    x_2 = .9 x_1 + b_1

and allowing b_1 to change by an amount equal to
.01 x_1 is not equivalent to using the constraint

    x_2 = .91 x_1

To illustrate: our solution for our problem yields

    x_1 = 2924.0
    x_2 = 2631.6

Whether we use a change in b_1 or a change in
recovery rate of 90%, we produce 2631.6 as the
output of center 1. A reduction of 1% in input of
29.240 substituted for b_1 yields

    2631.6 - .9 x_1 - 29.240 = 0

or

    x_1 = 2891.51

processed at 90% recovery rate.

If we use the constraint

    x_2 = .91 x_1

then

    2631.6 - .91 x_1 = 0

    x_1 = 2891.87

a difference of processing of .36 gallons @ $6.11
or approximately $2.19.

Thus if we use changes in b_1 at 1% of x_1 and
use the dual value for the constraint, the expected
improvement in profit is

    (.01) (2924) (6.1111) = 178.69

The new profit would then be

    18,339.59 + 178.69 = 18,518.28

A solution of the problem at 91% recovery rate
yields a profit of

    $18,516.32

The error is small but, importantly, the changes in
profit are in the right direction and the managerial
implications of a change in recovery rate are
of the correct orders of magnitude.
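The arithmetic of this comparison can be retraced with a short script. The sketch below is only a check on the figures quoted in the text (2924.0, 2631.6, 6.1111, 18,339.59 and 18,516.32); the function and variable names are illustrative assumptions and are not taken from the program used by the authors.

```python
# Figures quoted in the text, taken as given for this check.
x1 = 2924.0                  # gallons of A fed to center 1
x2 = 2631.6                  # gallons leaving center 1
dual = 6.1111                # dual value of the constraint x_2 - .90 x_1 = b_1
base_profit = 18339.59       # profit at 90% recovery
resolved_profit = 18516.32   # profit reported for the re-solve at 91% recovery


def input_with_rhs_shift(output, rate, b1):
    """Input needed when the constraint is output = rate * input + b1."""
    return (output - b1) / rate


def input_with_new_rate(output, rate):
    """Input needed when the recovery coefficient itself is changed."""
    return output / rate


b1 = 0.01 * x1                                  # 1% of the input, 29.24 gallons
via_rhs = input_with_rhs_shift(x2, 0.90, b1)    # right-hand-side approximation
via_rate = input_with_new_rate(x2, 0.91)        # exact coefficient change

estimated_gain = b1 * dual                      # dual-value estimate of the gain
estimated_profit = base_profit + estimated_gain
error = estimated_profit - resolved_profit      # estimate minus re-solved profit

print(round(via_rhs, 2), round(via_rate, 2))
print(round(estimated_gain, 2), round(estimated_profit, 2), round(error, 2))
```

The two input figures differ by about .36 gallons, and the dual-value estimate overstates the re-solved profit by roughly two dollars, in line with the discussion above.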

In a similar fashion all other recovery rate
changes can be assessed by using the dual solutions.
It is only necessary to change the input
to any center by 1%, then multiply by its corresponding
dual value to get an estimate of the
effect of a change of 1% in absolute value of any
recovery rate.

If one wished to find a more exact effect of
the change of recovery rate, it must be done by
finding the effect of changing the proper coefficient
in a constraint. For the example we have
studied above this amounts to changing a coefficient
for the variable x_1, which is a basic variable.
The model presents the necessary information
for this study, but it is much more detailed.

Sensitivity  on  Processing  Costs 

In order to study the effects of processing
rates, we examine the dual solutions corresponding
to constraints 9 through 15. For example,
if the rate of processing for center 1 were increased
from 300 to 301 gallons per hour, the dual
solution indicates an improvement of 50¢ in profit.

For any of the above cases it is necessary to
study the ranging of the right hand side to make
sure that there are no problems associated with
degeneracy.

Another part of the analysis would assess the
effects of changing processing costs for each of
the centers. This can be done directly by a study
of the ranging of the objective function for
variables

    x_12 through x_20

The  effects  of  changes  in  raw  material  costs 
are  given  by  looking  at  variables 

x^  and  Xg 

and  the  effects  of  changes  in  selling  price  are 
given  by  looking  at  variables 

x^,  Xg  and  x^^^- 

It  should  be  evident  that  the  more  detailed 
model  provides  much  more  managerial  information. 

Effects  on  the  Computer 

The  smaller  model  resulted  in  three  variables 
and  seven  constraints  and  for  the  Computer  Program 
used  [2]  at  Northwestern  University's  Computer 
Center  required  .01  seconds  of  computing  time. 
The  larger  model  resulted  in  44  variables  and  24 
constraints  and  required  .1940  seconds  of  comput- 
ing time. 

The  program  used  had  capabilities  of  sensi- 
tivity analysis.     When  use  was  made  of  Ranging 
both  the  Right  Hand  Side  and  the  Coefficients  of 
the  Objective  Function  the  total  computing  times 
were  .652  seconds  and  2.673  seconds  respectively. 

Memory  requirements  were  obviously  greater 
for  the  larger  model  but  were  no  threat  to  the 
solution  of  the  problem. 

The increases in both storage requirements and
computing time were not insignificant, but it is
clear they do not impose difficulties which cannot
be overcome. Should the modelling of a problem
along the lines illustrated in this paper lead
to excessive demands on either memory or computing
time, the problem can be partitioned so as to relieve
the excessive demands without diminishing
the advantages discussed in the paper, although
more than one computer run might have to be made.

In  this  example  satisfactory  reductions  can 
be  made  by  eliminating  the  definitions  and  con- 
straints in  terms  of  the  remaining  variables. 
This  reduces  the  problem  to  one  of  28  variables 
and  16  constraints  and  allows  one  to  make  the  sen- 
sitivity analysis  on  the  recovery  parameters. 

One  can  by  similar  means  eliminate  other  por- 
tions of  the  problem  and  thus  concentrate  on  one 
or  two  sections  at  a  time. 
The summary computer outputs for each of the
problems discussed in the paper are given in the
Appendix.

Conclusions

The basic conclusion that can be reached is
that substantial gains can be made in facility for
interpretation by proper modelling of the problem.
These gains are made at a cost of increased memory
requirements for the computer and increased computing
time.

Although consideration has been given to establishing
rules for the way in which the problem
should be modelled for such efficiency, no complete
set has yet been established. It remains
somewhat of an art, or perhaps requires a great
amount of foresight. In the examples used, an increase
in both the number of variables and the number of
constraints led to these efficiencies in interpretation,
but such is not always the case. There are
other cases where the gains are made by reducing
both the number of variables and the number of
constraints.

Complete computer outputs are not included
because of the space required. These are available,
however.

Appendix

Small Problem
Summary of Results

Var   Var      Row                                Opportunity
No    Name     No    Status   Activity Level      Cost

 1    x_1            B           2923.9766082
 2    x_2            NB                              .5516015
 3    x_3            B           1307.1895425
 4    Slack     1    B           5458.5483316
 5    Slack     2    B           3022.3684211
 6    Slack     3    B         160423.5294118
 7    Slack     4    B           7000.0000000
 8    Slack     5    NB                             2.4060028
 9    Slack     6    B            700.0000000
10    Slack     7    NB                             5.6997549

Maximum Value of the Objective Function = 18339.591933


Large Problem

Var   Var      Row                                Opportunity
No    Name     No    Status   Activity Level      Cost

 1    x_1            B           2923.9766082
 2    x_2            B           2631.5789474
 3    x_3            B           2500.0000000
 4    x_4            B           2125.0000000
 5    x_5            B           2125.0000000
 6    x_6            B              0.0000000
 7    x_7            B           1700.0000000
 8    x_8            B              0.0000000
 9    x_9            B           1307.1895425
10    x_10           B           1176.4705882
11    x_11           B           1000.0000000
12    x_12           B            800.0000000
13    x_13           B              9.7465887
14    x_14           B              5.8479532
15    x_15           B             10.0000000
16    x_16           B              5.3125000
17    x_17           NB                           265.6492411
18    x_18           B              2.6143791
19    x_19           B              2.4509804
20    x_20           B              2.5000000
21    Artif     1    NB                             6.1111111
22    Artif     2    NB                             6.9005848
23    Artif     3    NB                             8.9653939
24    Artif     4    NB                            11.8942423
25    Artif     5    NB                            11.8942423
26    Artif     6    NB                             7.3333333
27    Artif     7    NB                             9.2401961
28    Artif     8    NB                            12.3002451
29    Artif     9    NB                              .5000000
30    Artif    10    NB                              .4444444
31    Artif    11    NB                              .7200000
32    Artif    12    NB                              .5500000
33    Artif    13    NB                             -.0447121
34    Artif    14    NB                              .6000000
35    Artif    15    NB                              .5208333
36    Artif    16    NB                              .6000000
37    Artif    17    NB
38    Slack    18    B              3.6390322
39    Slack    19    B               .8395468
40    Slack    20    B              9.5490196
41    Slack    21    B              3.5000000
42    Slack    22    NB
43    Slack    23    B            700.0000000
44    Slack    24    NB                             5.6997549

Maximum Value of Objective Function = 18339.591933



References 


1.  Hadley, Linear Programming, Addison-Wesley,
    1962.

2.  Cohen, Stein, Multi Purpose Optimization System,
    Vogelback Computing Center, Northwestern
    University, 1975.




INTERACTIVE COMPUTER CODES FOR MATHEMATICAL PROGRAMMING EDUCATION


Robert P. Davis and James W. Chrissis
Department  of  Industrial  Engineering  and  Operations  Research 
Virginia  Polytechnic  Institute  and  State  University 
Blacksburg,  Virginia 


ABSTRACT 

This paper describes the role of interactive
computer programs in mathematical programming
education. Three different categories of interactive
programs are described and appropriate
environments are suggested in which each should
yield the greatest utility in allowing a student
to enhance his computational experience with, and
conceptual understanding of, an algorithm. An
example execution for a program representative of
each of these categories is given for illustrative
purposes.

INTRODUCTION 

There are essentially three problem solving
methods employed in gaining computational experience
with mathematical programming algorithms:

(1)  Manual computation.

(2)  Execution of existing "canned" programs
     (e.g., batch-oriented routines).

(3)  Execution of interactive codes for
     existing algorithms (e.g., data input ->
     decision input -> solution output).

Either manual computation or execution of existing
batch-oriented routines is the method most
frequently employed.

Manual computation is useful in providing
application experience with small problems, but
even in such cases can be quite time consuming.
In addition, minor mathematical errors may occur,
yielding erroneous results even though the
computational methodology is correct. For this
reason manual computation frequently degenerates to
mere number manipulation with an abandonment of all
attempts at conceptual interpretation. At the other
end of the spectrum, the execution of batch-oriented
routines can result in little or no understanding
of the structure of the algorithm being employed to
obtain a solution. The method favored by the
authors is one of developing and implementing
interactive codes for existing algorithms. This
approach is, admittedly, an attempt at reaching a
compromise between manual methods and batch-oriented
routines. Interactive computing can be
a very effective medium for reinforcing the
learning process in mathematical programming
education. Further, its implementation can
significantly reduce the time required for the
student to gain computational experience with
existing algorithms.

DISCUSSION 

Interactive computer algorithms can be
separated into three broad categories:

(1)  Initial data entry.

(2)  Intermediate interaction (limited
     decision inputs).

(3)  Intimate interaction in the algorithmic
     process.

Interactive programs requiring only initial input
data can be quite useful in a problem solving
environment where time is a factor and the student
is already familiar with the algorithm's operation.
The major utility from such programs lies in their
accessibility and capacity to provide results which
can be used for interpretation of model
characteristics and relevance. However, such algorithms
are not useful in reinforcing the solution
methodology they employ. One example of such an
algorithm is an interactive routine for solving
linear programming problems. With this algorithm,
the student enters the necessary problem data
(effectiveness coefficients, input-output matrix
and requirements vector) and the program returns a
solution. The solution can then be used to examine
model relevance or can provide a beginning
for conducting postoptimality analysis. Example
1 illustrates an execution sequence for such an
algorithm (2).

Algorithms which employ limited interaction
during the solution process are typically those
which require the student to enter values for
decision variables, request or abort sensitivity
analyses, or indicate whether optimality has been
achieved. As an illustration of such algorithms,
Example 2 shows an execution sequence for a
dynamic programming algorithm. This algorithm is
for serial systems with a linear returns function
and linear state transitions in a single
variable at each stage. Again, such algorithms
do not fully reinforce solution methodology, but
they can support an awareness of an algorithm's
decision process as well as illustrate inherent
characteristics of the models to which they apply.

Finally, there are those programs which
require an intimate interaction by the student in
the algorithm's process. With such routines the
student must provide the decision information
necessary for the algorithm to progress. In this
way, the algorithm's structure is emphasized and
a more complete awareness of its operation is
required. Further, an opportunity for reinforcing
a conceptual understanding and interpretation
of the algorithm is provided. To illustrate
such programs, Example 3 shows an execution
sequence for an interactive linear programming
algorithm which not only requires that initial
problem data be given, but further that each
pivoting step in the interactive process be
defined, and optimality and feasibility be
identified.

With such interactive programs the major
burden of mathematical manipulation is absorbed
by the code. The student is required to make
key decisions throughout the algorithm and is
provided time to examine the consequences of these
decisions. It should be noted that the student
is given an opportunity to observe the advantages
of machine computation in algorithmic operation;
and further, interactive codes are easily
structured to allow the student to observe, in
the code itself, the computational building blocks
which comprise the algorithm he employs.

Interactive programs of this last category
can be further extended to provide error detection
mechanisms. These error detections take one
of two forms. First, there are those which
indicate an incorrect decision but do not identify
what should have been the correct response. The
other not only indicates an error but also provides
the correct response (for example, in an
interactive LP code, a check could be made on
whether the minimum non-negative theta value was
selected in identifying the pivot row). It is the
opinion of the authors that error checking
mechanisms of this latter category are inappropriate
since they eliminate an important aspect of the
learning experience: that of identifying and
correcting erroneous decisions made in applying a
solution methodology. In fact, we question the
use of any form of error detection, since it does
not permit the student to observe the consequences
of an erroneous decision (with the LP example
suggested above, a non-positive element in the
solution basis).
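The first form of error detection described above (flagging a wrong choice without revealing the right one) can be sketched for the pivot-row decision. This is a hypothetical illustration, not the authors' code: the function names are assumptions, and the numbers are the pivot column and right-hand side of the first tableau in Example 3.

```python
def theta_values(pivot_column, rhs):
    """Ratio rhs / pivot entry for each row; None where the entry is not positive."""
    return [b / a if a > 0 else None for a, b in zip(pivot_column, rhs)]


def leaving_choice_is_valid(pivot_column, rhs, chosen_row):
    """True only if the chosen row attains the minimum theta over eligible rows.

    The function reports whether the choice is acceptable; it deliberately
    does not say which row should have been chosen.
    """
    thetas = theta_values(pivot_column, rhs)
    eligible = [t for t in thetas if t is not None]
    if not eligible or thetas[chosen_row] is None:
        return False
    return thetas[chosen_row] == min(eligible)


# First tableau of Example 3: variable 3 enters; rows hold variables 5, 6, 7.
pivot_column = [3, 4, 1]
rhs = [18, 26, 30]
thetas = theta_values(pivot_column, rhs)
first_row_ok = leaving_choice_is_valid(pivot_column, rhs, 0)
second_row_ok = leaving_choice_is_valid(pivot_column, rhs, 1)
print(thetas, first_row_ok, second_row_ok)
```

Whether such a check should be offered at all is exactly the question the authors raise; the sketch only shows how little code the weaker form requires.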

This last point brings us to an important
question which comes to the mind of the instructor
who seeks to implement interactive computing in
a teaching/learning activity: "What is the
appropriate level (category) of interaction for
the activity at hand?" For this, we have no
absolute response. In general, we can state
that the appropriate category of interaction is
a function of the student's familiarity with an
algorithm and the purpose of applying the program.
If the student is in the process of learning
the structure of an algorithm, the last of these
categories is the most appropriate. If he is
familiar with the methodology and seeks a quick
solution to a particular model to examine its
relevance or desires to initialize a solution to
conduct sensitivity analysis, then the first
category should suffice. It remains for a
comprehensive investigation to provide definitive
guidance to the instructor in assisting him to
identify an appropriate code for his specific
activity. What must be recognized is that there
exists more than one level of interaction to
which such programs can be brought and that an
appropriate level for one activity may not be
appropriate for another.

CONCLUSION

Interactive computer codes can be employed
with utility to instruction in mathematical
programming education. These codes can be
divided into three basic categories with each
representing a different level (or intimacy) of
interaction. The appropriate category for a
particular teaching/learning activity is not
definitively established but is postulated to
be a function of the student's background and
the purpose for which the algorithm is being
employed. As the purpose for applying the
algorithm tends to be one of reinforcing the
student's comprehension of a solution methodology
and conceptual interpretation, the intimacy of
interaction manifested in the code should increase.
EXAMPLE 1


EXAMPLE FROM SECTION 3.8


THE ORIGINAL COEFFICIENTS OF THE CONSTRAINTS

CODE 0 ==> <OR= CONSTRAINT
CODE 1 ==> >OR= CONSTRAINT
CODE 2 ==>   =  CONSTRAINT


I  CODE  CONSTANT  A(I,1)  A(I,2)  A(I,3)  A(I,4)  A(I,5)  A(I,6)  A(I,7)  A(I,8)


I 

0 

30.00 

2.00 

"i.oo 

2 

0 

3.00 

2.00 

3 

I 

3.00 

1.00 

I. 00 

THE COEFFICIENTS IN THE ORIGINAL OBJECTIVE FUNCTION TO BE MINIMIZED ARE:
-6.00  -4.00


BASIC SOLUTION 1

XB( 1)= X( 4)=    30.00
XB( 2)= X( 5)=    24.00
XB( 3)= X( 6)=     3.00

CURRENT VALUE OF THE OBJECTIVE FUNCTION IS   -0.30000000E+04


BASIC SOLUTION 2

XB( 1)= X( 4)=    24.00
XB( 2)= X( 5)=    15.00
XB( 3)= X( 1)=     3.00

CURRENT VALUE OF THE OBJECTIVE FUNCTION IS    0.18000000E+02


BASIC SOLUTION 3

XB( 1)= X( 4)=    14.00
XB( 2)= X( 3)=     5.00
XB( 3)= X( 1)=     8.00

CURRENT VALUE OF THE OBJECTIVE FUNCTION IS    0.48000000E+02


THE LAST BASIC FEASIBLE SOLUTION IS OPTIMAL

OPTIMAL VALUE OF THE ORIGINAL OBJECTIVE FUNCTION IS  -48.00
EXAMPLE 2


TYPICAL STAGE DESCRIPTION:

Stage Decision
d(I)<s(I)> = K(I)*s(I)
0<=K(I)<=1

Input State                  Output State
--- s(I) -->  [ stage I ]  --- s(I-1)=A*s(I)+B*d(I)<s(I)> -->

Stage Returns
r(I)=C*s(I)+D*d(I)<s(I)>

Cumulative Returns
f(I)=r(I)+f(I-1)<s(I-1)>

If the problem is to Maximize enter a "1" for L; otherwise, enter a "0" to Minimize.
L
1

Enter the state transformation parameters: A,B as requested.
A
.7
B
.4

Enter the returns function parameters: C,D as requested.
C
.5
D
-.2

Enter the number of stages(N), up to a maximum of 5.
N
4

Stages to go: 1

Enter a value for K( 1) which will maximize:
f( 1)=  0.5000*s( 1) + ( -0.2000)*K( 1)*s( 1)
K(1)
0


Stages to go: 2

Enter a value for K( 2) which will maximize:
f( 2)=  0.8500*s( 2) + (  0.0000)*K( 2)*s( 2)
K(2)
0


Stages to go: 3

Enter a value for K( 3) which will maximize:
f( 3)=  1.0950*s( 3) + (  0.1400)*K( 3)*s( 3)
K(3)
1


Stages to go: 4

Enter a value for K( 4) which will maximize:
f( 4)=  1.3645*s( 4) + (  0.2940)*K( 4)*s( 4)
K(4)
1


DECISION SUMMARY

Stages to go    Decision

     1            0.00
     2            0.00
     3            1.00
     4            1.00


TOTAL RETURN =  1.6585 s( 4)

Enter the value of s( 4) as requested.
s(4)


SYSTEM PERFORMANCE

K= 1

1.00

1.10

0.30

1.10

TOTAL RETURN =  1.66

This terminates the procedure.
*•       111.  XEQ  "STOP".
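The coefficients printed at each stage of Example 2 follow from the cumulative returns recursion f(I) = r(I) + f(I-1)<s(I-1)>. The sketch below retraces that recursion for the parameters of the example (A = .7, B = .4, C = .5, D = -.2, N = 4). It is an illustrative reconstruction, not the program used in the paper: the function name is an assumption, and it assumes non-negative states and that K(I) is left at 0 when the coefficient of K(I)*s(I) is zero.

```python
def solve_serial_dp(A, B, C, D, n_stages, maximize=True):
    """Backward recursion for a serial system with linear returns.

    With f(I-1) = v * s(I-1), substituting s(I-1) = A*s(I) + B*K(I)*s(I) gives
        f(I) = (C + v*A) * s(I) + (D + v*B) * K(I) * s(I),
    so the best K(I) in [0, 1] sits at an end point.
    """
    v = 0.0                     # coefficient of the return beyond the last stage
    decisions = []              # K(1), K(2), ... in "stages to go" order
    printed = []                # (state coefficient, decision coefficient) per stage
    for _ in range(n_stages):
        state_coeff = C + v * A
        decision_coeff = D + v * B
        improves = decision_coeff > 0 if maximize else decision_coeff < 0
        k = 1.0 if improves else 0.0
        v = state_coeff + decision_coeff * k
        decisions.append(k)
        printed.append((state_coeff, decision_coeff))
    return decisions, printed, v


decisions, printed, total_coeff = solve_serial_dp(0.7, 0.4, 0.5, -0.2, 4)
for stage, (a, b) in enumerate(printed, start=1):
    print(f"f({stage}) = {a:.4f}*s + ({b:.4f})*K*s, K = {decisions[stage - 1]:.2f}")
print(f"TOTAL RETURN = {total_coeff:.4f} s(4)")
```

Because each f(I) is linear in s(I), only one number has to be carried from stage to stage, which is why the listing can state the total return as a multiple of s(4) before a starting state is entered.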
EXAMPLE 3


DO YOU KNOW HOW TO USE THIS PROGRAM?
? N

THIS IS AN INTERACTIVE LINEAR PROGRAMMING ROUTINE
YOU MUST HAVE YOUR PROBLEM FORMULATED AS A MAXIMIZATION
PROBLEM.
THIS PROGRAM WILL ACCEPT >,< OR = CONSTRAINTS.
THE TYPE OF CONSTRAINT MUST BE INDICATED TO THE PROGRAM.
THE INDICATORS ARE: L FOR <=, G FOR >=, AND E FOR =.
ALL >= CONSTRAINTS ARE CONVERTED TO <= CONSTRAINTS.
ARE YOU READY TO USE THIS PROGRAM TO OBTAIN A SOLUTION?
? Y

ENTER THE NUMBER OF CONSTRAINTS FOLLOWED BY THE
NUMBER OF STRUCTURAL VARIABLES. SEPARATE THE NUMBERS
WITH A COMMA (,).
? 3,4

ENTER THE COEFFICIENTS OF THE OBJECTIVE FUNCTION
C(1), ... ,C(N)
? 3,4,5,1

ENTER THE INDICATOR FOR THE TYPE OF CONSTRAINT,
L FOR <=, G FOR >=, E FOR =, ON ONE LINE
FOLLOWED BY THE LEFT-HAND SIDE CONSTRAINT COEFFICIENTS
ONE ROW AT A TIME: A(1,1), ... ,A(1,N), ONE PER LINE.
? L
? 2,1,3,1
? L
? 1,2,4,2
? L
? 3,2,1,1

ENTER THE VECTOR OF RIGHT-HAND SIDE COEFFICIENTS
B(1), ... ,B(M).
? 18,26,30

CURRENT BASIC SOLUTION:

ROW       VARIABLE    VALUE
 1           5         18
 2           6         26
 3           7         30
OBJECTIVE FUNCTION = 0

IS THIS SOLUTION FEASIBLE?
? Y

NON-BASIC   Z(J)-C(J)
VARIABLE    VALUE
   1         -3
   2         -4
   3         -5
   4         -1

IS THE SOLUTION OPTIMAL ?
? N

ENTER THE NUMBER OF THE NON-BASIC VARIABLE
YOU WANT TO BRING INTO THE SOLUTION.
? 3

BASIC       PIVOT       R.H.S.      THETA
VARIABLE    COLUMN      VALUE       VALUE
   5         3           18          6
   6         4           26          6.5
   7         1           30          30
ENTER THE NUMBER OF THE BASIC VARIABLE
TO LEAVE THE SOLUTION.
? 5

NON-BASIC   Z(J)-C(J)
VARIABLE    VALUE
   1         .333333
   2         -2.33333
   5         1.66667
   4         .666667

IS THE SOLUTION OPTIMAL ?
? N

ENTER THE NUMBER OF THE NON-BASIC VARIABLE
YOU WANT TO BRING INTO THE SOLUTION.
? 2

BASIC       PIVOT       R.H.S.      THETA
VARIABLE    COLUMN      VALUE       VALUE
   3         .333333     6           18.
   6         .666667     2.00002     3.00002
   7         1.66667     24          14.4
ENTER THE NUMBER OF THE BASIC VARIABLE
TO LEAVE THE SOLUTION.
? 6

NON-BASIC   Z(J)-C(J)
VARIABLE    VALUE
   1         -5.5
   6         3.5
   5         -3
   4         2.99999
IS THE SOLUTION OPTIMAL ?
? N

ENTER THE NUMBER OF THE NON-BASIC VARIABLE
YOU WANT TO BRING INTO THE SOLUTION.
? 1

BASIC       PIVOT       R.H.S.      THETA
VARIABLE    COLUMN      VALUE       VALUE
   3         1.5         5           3.33333
   2         -2.5        2.99997     -1.19999
   7         6.5         19          2.92308
ENTER THE NUMBER OF THE BASIC VARIABLE
TO LEAVE THE SOLUTION.
? 7

NON-BASIC   Z(J)-C(J)
VARIABLE    VALUE
   7         .846154
   6         1.38461
   5         -.461538
   4         2.15384
IS THE SOLUTION OPTIMAL ?
? N

ENTER THE NUMBER OF THE NON-BASIC VARIABLE
YOU WANT TO BRING INTO THE SOLUTION.
? 5

BASIC       PIVOT       R.H.S.      THETA
VARIABLE    COLUMN      VALUE       VALUE
   3         .307692     .615385     2
   2         -.846154    10.3077     -12.1818
   1         .461538     2.92308     6.33333
ENTER THE NUMBER OF THE BASIC VARIABLE
TO LEAVE THE SOLUTION.
? 3

NON-BASIC   Z(J)-C(J)
VARIABLE    VALUE
   7         .499999
   6         1.5
   3         1.5
   4         2.5

IS THE SOLUTION OPTIMAL ?
? Y

FINAL SOLUTION VALUES AT TERMINATION
CURRENT BASIC SOLUTION:

ROW       VARIABLE    VALUE
 1           5         2
 2           2         12
 3           1         2.

OBJECTIVE FUNCTION = 54.
R;  T=0.52/3.95  15:28:16
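The pivoting sequence chosen by the student in Example 3 can be replayed with exact rational arithmetic. The following is a minimal sketch under stated assumptions, not the program listed above: the `pivot` helper and the tableau layout (constraint rows followed by a Z(J)-C(J) row, right-hand side in the last column) are illustrative choices.

```python
from fractions import Fraction


def pivot(tableau, basis, row, col):
    """Gauss-Jordan pivot on tableau[row][col]; variable col+1 enters the basis."""
    p = tableau[row][col]
    tableau[row] = [v / p for v in tableau[row]]
    for r in range(len(tableau)):
        if r != row and tableau[r][col] != 0:
            factor = tableau[r][col]
            tableau[r] = [v - factor * w for v, w in zip(tableau[r], tableau[row])]
    basis[row] = col + 1


# Data entered in Example 3: maximize 3x1 + 4x2 + 5x3 + x4, three <= constraints.
rows = [
    [2, 1, 3, 1, 1, 0, 0, 18],
    [1, 2, 4, 2, 0, 1, 0, 26],
    [3, 2, 1, 1, 0, 0, 1, 30],
    [-3, -4, -5, -1, 0, 0, 0, 0],   # Z(J)-C(J) row; last entry is the objective
]
tableau = [[Fraction(v) for v in r] for r in rows]
basis = [5, 6, 7]                   # slack variables start in the basis

# (entering variable, leaving variable) pairs chosen in the listing.
for entering, leaving in [(3, 5), (2, 6), (1, 7), (5, 3)]:
    pivot(tableau, basis, basis.index(leaving), entering - 1)

values = [tableau[i][-1] for i in range(3)]
objective = tableau[3][-1]
reduced_costs = {j + 1: tableau[3][j] for j in range(7) if j + 1 not in basis}
print(basis, values, objective, reduced_costs)
```

Exact fractions avoid the small rounding residues visible in the listing (2.00002, 2.99999, .499999) while reaching the same final basis and objective value.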
A PROBLEM SOLVING SYSTEM FOR NONLINEAR LEAST SQUARES
I. INTRODUCTION

In this paper, we discuss a system for solving
unconstrained nonlinear least squares problems.
The problem is defined as follows:

    \min_x F = \min_x \sum_{i=1}^{M} [ f_i(x_1, x_2, \ldots, x_N) ]^2

where the f_i are nonlinear functions of the parameters
x_1, x_2, ..., x_N. A special case of this problem,
of great practical importance, is the nonlinear
regression problem, where the f_i represent the
residuals obtained by fitting a nonlinear model to
experimental data.
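As a concrete reading of this definition, the objective F can be evaluated directly from a list of residual functions. The sketch below is illustrative only: the exponential model, the three data points and all names are hypothetical and do not come from the paper or its FORTRAN package.

```python
import math


def sum_of_squares(residuals, x):
    """F(x): the sum over i of f_i(x) squared."""
    return sum(f(x) ** 2 for f in residuals)


def make_residual(t, y):
    """Residual of the hypothetical model y = x[0] * exp(x[1] * t) at one data point."""
    return lambda x: x[0] * math.exp(x[1] * t) - y


# Made-up (t, y) observations, used purely for illustration.
data = [(0.0, 2.0), (1.0, 1.2), (2.0, 0.7)]
residuals = [make_residual(t, y) for t, y in data]

F = sum_of_squares(residuals, [2.0, -0.5])
print(F)
```

In the regression setting M is the number of observations and N the number of model parameters, so here M = 3 and N = 2.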

The motivation behind the development of this
system is to provide the user with a broad range
of facilities which he can activate to ultimately
enable him to obtain a solution with a minimum
expenditure of computer time and his own time. We
subscribe to and attempt to extend the philosophy
given by Aird [1].

The four major goals of this problem solving
system are:

(1)  To give the user information about the
     behavior of his function in a region which he
     specifies; that is, at a set of points uniformly
     distributed throughout the region, to
     furnish the user with a sampling of F as well
     as with certain gradient and Hessian information,
     when this information is needed and
     used anyway by other components of the problem
     solving system. One of the disadvantages
     of many existing optimization algorithms is
     that when they do not converge, the user is
     left with little, if any, information about
     the behavior of F in his region of interest.

(2)  To give the user assistance from the system
     in choosing good starting points. Many nonlinear
     models are so complex that the
     scientist has little advance knowledge of the
     location of the optimum parameter values.
     Even in cases where the scientist has (from
     physical, biological, etc., considerations) a
     good estimate of the optimal parameters, say
     to within one order of magnitude, F might have
     a number of maxima, minima, and saddle-points
     close to the optimum of interest. (A saddle-point
     of F is a point where the gradient of F
     is zero, but F is neither a maximum nor a
     minimum.) The system provides the user with
     the ability to specify an entire closed region
     which he believes contains the minimum instead
     of forcing him to specify a single
     starting point. All too often a single
     "rough" starting point can produce divergence
     or even convergence to a local minimum far
     from the minimum of interest. Once the bounds
     of the region have been furnished by the user
     as input, the system automatically narrows in
     on a smaller but highly promising region in
     which good starting points exist.

(3)  To utilize a number of optimization methods so
     as to solve those problems for which particular
     optimization methods fail to find a
     satisfactory solution or perform poorly. For
     example, whereas a Levenberg-Marquardt type
     of method (which uses Jacobian matrix information)
     might be well suited for a certain
     scientific model throughout most of a particular
     region, a vastly different type of method,
     say a simplex method which uses no partial
     derivative information (only function values),
     might be more appropriate for some parts of
     that region.

(4)  To systematically and automatically find the
     global minimum in the user's region of
     interest. (By "global minimum" in this paper,
     we mean the smallest of the local minima
     values in the user's region of interest.)
     This is accomplished in two ways. First, as
     described above in (2), starting guess candidates
     are successively narrowed down to ones
     most likely to yield the global minimum.
     Second, even after successful convergence to
     a minimum from one or more of the starting
     points, the user can pre-specify that additional
     promising starting points should be
     pursued in the attempt to find the global
     minimum.


Work performed under the auspices of the U.S. Energy Research and Development Administration.

A portion of the work of this author was supported by the Office of Naval Research under
Grant N 00014-76-C-0329.
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At  present,  the  algorithm  vdiich  inplements 
the  basic  philosophy  of  this  system  is  at  an  exper- 
imental level  and  should  not  be  mistaken  for  the 
final  version,  nor  should  the  FORTRAN  package  vAiich 
realizes  this  implsnentation  be  construed  as  final 
code. 

The  paper  is  organized  into  six  sections. 
Sections  II  and  III  describe  the  general  algorithm, 
section  IV  discusses  the  current  implementation  of 
the  general  algorithm,  section  V  presents  a  test 
problem  with  numerical  results  obtained  from  the 
FORTRAN package for the current implementation,
and  section  VI  describes  future  research  plans. 


II.     DISCUSSION OF THE ALGORITHM


Figure  1 


The structure of the system solver was modu-
larized into four basic tasks or phases:

(1)  Pre-optimal  analysis 

(2)  Starting  point  generation  and  selection 

(3)  Problem  solution 

(4)  Post-optimal  analysis 

In order to implement the concept of modulari-
zation, a control program was constructed. This
program gives the user the ability to select one
or more of the above phases with which to attack
his problem. Each phase can be accessed alone or
as part of a collection which includes other phases.

III.    DESCRIPTION OF THE PHASES

Phase 1. The phase which we call "pre-optimal
analysis" incorporates the ideas of (1) pre-
scaling of the problem, and (2) verification and
initialization of user-supplied routines, in par-
ticular, the verification of user-furnished exact
analytic partial derivatives by means of finite-
difference techniques.
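
The derivative verification mentioned for phase one can be sketched as follows. This is only an illustration in Python, not the FORTRAN package; the function name check_gradient, the central-difference step h, and the tolerance tol are assumptions introduced here, since the paper does not give these details.

```python
def check_gradient(f, grad, x, h=1e-6, tol=1e-4):
    """Compare a user-supplied gradient with central differences.

    Returns a list of (index, analytic, numeric) for components that disagree.
    The step h and tolerance tol are illustrative choices, not values from the paper.
    """
    analytic = grad(x)
    suspects = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        numeric = (f(xp) - f(xm)) / (2.0 * h)
        # Relative comparison so that large and small derivatives are treated alike.
        scale = max(1.0, abs(analytic[i]), abs(numeric))
        if abs(analytic[i] - numeric) > tol * scale:
            suspects.append((i, analytic[i], numeric))
    return suspects


def f(x):
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def good_grad(x):
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]


def bad_grad(x):
    # Deliberate error in the second component.
    return [2.0 * x[0] + 3.0 * x[1], 2.0 * x[0]]


print(check_gradient(f, good_grad, [1.0, 2.0]))
print([i for i, _, _ in check_gradient(f, bad_grad, [1.0, 2.0])])
```

The first call should report no suspect components, while the second should flag the component whose analytic derivative was deliberately miscoded.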

Phase 2. The motivation behind phase two is derived
from the situation a user faces when attacking a
problem about which he has very little information.
He may not be able to choose a starting point
sufficiently close to the minimum. Perhaps at best
he can supply upper and lower bounds on a region
in which he suspects a minimum to exist. In order
to handle situations such as this, a point-disper-
sion algorithm can be used to generate a specified
number of points which are uniformly distributed
in a closed rectangular region that the user has
defined. Let the set $S = \{s_1, s_2, \ldots, s_{N_1}\}$ denote
these points. Now evaluate F on the set S and select
a subset, $T = \{t_1, t_2, \ldots, t_{N_2}\} \subset S$, with $N_2 \le N_1$,
which consists of the ordered points for which F
has ascending values; i.e.

$F(t_1) \le F(t_2) \le \ldots \le F(t_{N_2})$.

At this stage, $t_1$ appears to be our most promising
point for producing a minimum of F; however, as
Figure 1 (for a one-dimensional problem) shows, $t_1$
might not lie in the valley containing the minimum
of interest.


All too many problem solving systems would use $t_1$
as a starting point for an optimization procedure
which would then converge to x. Our approach
follows that of Aird [1], namely, from each of
the points $t_1, t_2, \ldots, t_{N_2}$, we use our best
(fastest) optimization algorithm for a small number
of iterations (we have used three for this number
with excellent success), producing yet a third set
of points, $P = \{p_1, p_2, \ldots, p_{N_2}\}$. Again, we evaluate
F on the set P and select a subset
$R = \{r_1, r_2, \ldots, r_{N_3}\} \subset P$, with $N_3 \le N_2$, which con-
sists of points ordered so that
$F(r_1) \le F(r_2) \le \ldots \le F(r_{N_3})$.
The points in R can then be used succes-
sively as "final starting points". In practice,
the restricted set R has yielded excellent starting
points for use in phase three. In the figure above,
please note that any good gradient (descent) pro-
cedure would result in $r_1 = x^*$ being used as the
first (most promising) starting point in phase
three.
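
The selection of starting points described for phase two can be sketched in a few lines. The sketch below is in Python and makes several assumptions that are not in the paper: plain uniform random sampling stands in for the point-dispersion algorithm, a finite-difference steepest-descent step with step halving stands in for the fast local optimizer, and the names select_starting_points and descent_step are hypothetical.

```python
import random


def descent_step(F, x, h=1e-6):
    """One steepest-descent step with a finite-difference gradient and step halving.

    A stand-in for the 'best (fastest)' local method; not the method used in the paper.
    """
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((F(xp) - F(xm)) / (2.0 * h))
    fx = F(x)
    gg = sum(gi * gi for gi in g)
    step = 1.0
    while step > 1e-12:
        trial = [xi - step * gi for xi, gi in zip(x, g)]
        if F(trial) <= fx - 1e-4 * step * gg:   # sufficient-decrease test
            return trial
        step *= 0.5
    return x


def select_starting_points(F, local_step, lower, upper, n1, n2, n3,
                           n_iter=3, seed=0):
    """Scatter n1 points, keep the best n2, refine each briefly, keep the best n3."""
    rng = random.Random(seed)
    # S: n1 points drawn uniformly from the user's rectangular region.
    S = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
         for _ in range(n1)]
    # T: the n2 points of S with the smallest F values, in ascending order.
    T = sorted(S, key=F)[:n2]
    # P: each point of T after n_iter iterations of the local method.
    P = []
    for t in T:
        p = t
        for _ in range(n_iter):
            p = local_step(F, p)
        P.append(p)
    # R: the n3 points of P with the smallest F values, in ascending order.
    return sorted(P, key=F)[:n3]


def F(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


R = select_starting_points(F, descent_step, [-5.0, -5.0], [5.0, 5.0],
                           n1=20, n2=5, n3=2)
for r in R:
    print(r, F(r))
```

The returned list plays the role of the set R: its first element is the most promising "final starting point", and the remaining elements would be tried in turn if phase three fails from the first.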

Phase 3. The methodology of phase three concen-
trates on problem solution: convergence to a
minimum. Two major considerations have arisen in
the development of a general algorithm:

(1) What strategy can be used to blend
existing complementary optimization
techniques in such a way as to achieve
a balance between efficiency and robust-
ness? (By "efficiency" we mean minimiza-
tion of execution time and core stor-
age requirements. By "robustness" we
mean the ability of a method to converge
from a very wide distribution of start-
ing points for a variety of functions.)

(2) What criteria are involved in discon-
tinuing the use of one method and
initiating the use of another?

Ideally, we would like to use one method that
is both efficient and robust. However, highly
efficient methods are seldom robust, and vice
versa. A local optimizer is classified as an
efficient method and is one which converges very
rapidly for starting points that are close to
(local to) the minimum. An example of a local
optimizer is the Gauss-Newton method applied to
finding the minimum of a quadratic function (from
any starting point). In contrast, a robust method
will usually converge to a solution from a wide
collection of starting points, although the con-
vergence rate is generally much slower than when
using a local optimizer. An example of a robust
method is a nonlinear simplex algorithm.

These considerations led to the development
of a hierarchical structure in which the most effi-
cient method is at the top of the hierarchy while
less efficient but more robust methods follow, and
finally, the most robust method is at the bottom of
the hierarchy. The structure we have adopted
utilizes an efficient method as the primary method
of problem solution. However, if the efficient
method fails to make reasonable progress towards
the minimum, alternative methods which are less
efficient but more robust and which have completely
different structural properties should be available.
For example, it makes little sense to switch back
and forth between local methods such as Gauss-
Newton, Levenberg-Marquardt, and Steepest Descent
if the Levenberg-Marquardt algorithm is not con-
verging properly, since the Levenberg-Marquardt
algorithm is already a compromise between the other
two. Instead, one should try a method having a
completely different structure, perhaps, in this
case, a Quasi-Newton method or one of the better
heuristic search procedures.

In our implementation the local method is
used and reused first, and only when the local
method fails to make progress do we switch to a
more robust method.

Phase  4.    In  phase  four,  which  we  call  "post- 
optimal  analysis",  it  is  verified  whether  or  not 
the  solution  obtained  in  phase  three  is  indeed 
the  minimum,  as  opposed  to,  say,  a  saddle-point. 

IV.     CURRENT IMPLEMENTATION

The current implementation of the system
solver is described below with the aid of the
flow diagrams shown in Figures 2 through 5.

The control program (see Figure 2) directs
the flow of control of the system. This is
accomplished by two input parameters, INOPT and
OUTOPT, each of which takes on a value from one
to four. The value of INOPT specifies which
phase the user wants to access first, while the
value of OUTOPT defines the last phase he wants
to access. For example, if the user requested
execution of all four phases, INOPT would be set
to one, and OUTOPT would be set to four. However,
if he only wanted to generate starting points, both
INOPT and OUTOPT would be set to the value of two.
The control program also enables the user to
specify more than one starting point when attempt-
ing to find the minimum. (These points will be
used on successive attempts by phase three.) The
ability to use more than one starting point has a
twofold advantage:

(1) It increases the possibility of converg-
ing to a global minimum from at least
one starting point.

(2) If the user's problem is very complex
and it is difficult to find even a
local minimum, the possibility of con-
verging to a local minimum increases when
using more than one starting point.
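
The way INOPT and OUTOPT select a run of consecutive phases can be illustrated with a short Python sketch. The function name phases_to_run and the validity check (INOPT not exceeding OUTOPT) are assumptions made for the illustration; the paper states only that each parameter takes a value from one to four.

```python
PHASES = {
    1: "pre-optimal analysis",
    2: "starting point generation and selection",
    3: "problem solution",
    4: "post-optimal analysis",
}


def phases_to_run(inopt, outopt):
    """Return the phase numbers executed for a given INOPT/OUTOPT pair."""
    # Assumed rule: the first phase may not come after the last phase.
    if not (1 <= inopt <= outopt <= 4):
        raise ValueError("require 1 <= INOPT <= OUTOPT <= 4")
    return list(range(inopt, outopt + 1))


for inopt, outopt in [(1, 4), (2, 2)]:
    names = [PHASES[p] for p in phases_to_run(inopt, outopt)]
    print(inopt, outopt, names)
```

The two pairs shown correspond to the two examples in the text: all four phases, and starting point generation alone.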

At present, phase one has not been implemented.

The construction of phase two (see Figure 3)
is somewhat more detailed than its description
given above in section III, but it parallels the
general algorithm. $N_1$, $N_2$, and $N_3$ are user-
supplied parameters. The methods currently used
in phase two are the Aird and Rice point-disper-
sion algorithm [2], and Brown's derivative-free
modification of the Levenberg-Marquardt method [3].

The current implementation of phase three
(see Figure 4) utilizes a local optimizer as the
primary method for solution and an alternative
method, a nonlinear simplex algorithm. The local
method is used initially for a specified number of
iterations. When the method does not converge to
a minimum, the progress of the method is evaluated.
If the results indicate that it is performing well
enough to continue, the flow of control returns to
the local optimizer. If the progress evaluation
indicates that the method is performing poorly,
the nonlinear simplex method is employed. If this
method does not converge to a minimum in a specified
number of iterations, an evaluation of its
performance is necessary. The flow of control re-
turns to the local optimizer if the results indi-
cate a significant degree of progress. However, if
the simplex method has performed unsatisfactorily,
the flow of control returns to the control program
with an indication that a solution could not be
found. At this point, the next most promising
starting point candidate is used and phase three
is re-entered. Successful progress of both opti-
mizers is dependent upon a significant decrease in
the F value or in the norm of the gradient. The
local method currently used is Brown's method [3].
The nonlinear simplex algorithm used is Parkinson's
modified version of Nelder and Mead's nonlinear
simplex algorithm [4,5].
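
The flow of control just described can be summarized in a short Python sketch. It is not the FORTRAN implementation: the two methods are passed in as arbitrary callables, the progress test (F at least halved) is an assumed stand-in for "a significant decrease in the F value", and the toy methods stalled and halver exist only to exercise the switching logic.

```python
def solve_from(F, x0, local_method, robust_method, converged, progressed,
               max_rounds=50):
    """Alternate between a local and a robust method, as described for phase three.

    local_method, robust_method: callables (F, x) -> x after a fixed number of iterations.
    converged(F, x) -> bool; progressed(f_old, f_new) -> bool.
    Returns (x, True) on convergence, or (x, False) if no solution could be found.
    """
    x = x0
    for _ in range(max_rounds):
        # Use the local optimizer first, and reuse it while it makes progress.
        f_old = F(x)
        x = local_method(F, x)
        if converged(F, x):
            return x, True
        if progressed(f_old, F(x)):
            continue
        # The local method is performing poorly: employ the robust method.
        f_old = F(x)
        x = robust_method(F, x)
        if converged(F, x):
            return x, True
        if not progressed(f_old, F(x)):
            return x, False   # the caller moves on to the next starting point
    return x, False


def F(x):
    return (x[0] - 3.0) ** 2


def converged(F, x):
    return F(x) < 1e-12


def progressed(f_old, f_new):
    return f_new < 0.5 * f_old


def stalled(F, x):
    return x                          # toy method that makes no progress


def halver(F, x):
    return [(x[0] + 3.0) / 2.0]       # toy method: slow but reliable


print(solve_from(F, [10.0], stalled, halver, converged, progressed))
print(solve_from(F, [10.0], stalled, stalled, converged, progressed))
```

In the first call the local method stalls every round and the robust method carries the iteration to convergence; in the second neither method progresses, so control returns with a failure indication, which is the point at which the next starting point candidate would be tried.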

Post-optimal analysis in phase four (see
Figure 5) is accomplished by examining points in a
neighborhood of the solution obtained in phase
three and then comparing the F values of these
points against the F value at the solution point.
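
A neighborhood comparison of this kind might look like the following Python sketch. How the neighboring points are chosen is not stated in the paper; the small grid of offsets, the radius, and the name looks_like_minimum are assumptions made for the illustration. The grid grows as 3^n with the number of variables, so the sketch is practical only for small problems.

```python
import itertools


def looks_like_minimum(F, x, radius=1e-3, tol=0.0):
    """Return False if some neighboring point has a smaller F value than x."""
    fx = F(x)
    offsets = (-radius, 0.0, radius)
    for delta in itertools.product(offsets, repeat=len(x)):
        if not any(delta):
            continue                      # skip the solution point itself
        neighbor = [xi + di for xi, di in zip(x, delta)]
        if F(neighbor) < fx - tol:
            return False
    return True


def bowl(x):
    return x[0] ** 2 + x[1] ** 2


def saddle(x):
    return x[0] ** 2 - x[1] ** 2


print(looks_like_minimum(bowl, [0.0, 0.0]))
print(looks_like_minimum(saddle, [0.0, 0.0]))
```

Both functions have a zero gradient at the origin, but only the first passes the check; the second is rejected because F decreases along the y direction.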

V.      A TEST PROBLEM AND NUMERICAL RESULTS

A FORTRAN program based upon the current im-
plementation of the problem solving system has
been run successfully on a collection of standard
test problems. (Those results will be documented
elsewhere.) For the purposes of this discussion,
we have constructed a new test problem whose
geometry would challenge the ability of the system
to find a global minimum.

The test problem is given by
minimize F(x,y) where

$F = .0001 \, (f_1^2 + f_2^2)$,

and

$f_1 = 200. - 175. \, [\exp(-(x - 17.)^2) + \exp(-(y - 17.)^2)]$,

$f_2 = 5. \, [(x - 12.)^2 (x - 23.) (x - 17.) + (y - 12.)^2 (y - 23.) (y - 17.)]$.

Note  that  F  is  symmetric  in  x  and  y. 

A three-dimensional plot of F over the range
of interest is given in Figure 8. In order to
better exhibit the topography of F around the glo-
bal minima, a contour plot is shown in Figure 9.

The reason that F is challenging, especially
for local optimization methods, is that the region,
x = (11,24) and y = (11,24), contains a number of
local minima and saddle-points, including a saddle-
point close to the global minima. Specifically,
the points given (approximately) by (12,12),
(12,23), (23,12), and (23,23) are all local minima
of F and at those points the value of F is 4.
Similarly, the points given (approximately) by
(12,17), (17,12), (17,23), and (23,17) are also
local minima of F; at each of these points F takes
on the value of .0625. The troublesome saddle-
point occurs at (17,17) where F assumes the value
of 2.25; this saddle-point is near the global
minima which are given (approximately) by
(17.61014224,16.110990468) and the symmetrically
placed point, (16.110990468,17.61014224). At the
global minima F has the value of zero.
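
The F values quoted above can be checked numerically. The Python sketch below is an illustration, not the FORTRAN test program; it assumes that the exponents in the test function are squares on (x - 17), (y - 17), (x - 12), and (y - 12) only, a reading under which the quoted values of 4, .0625, 2.25, and zero are all reproduced.

```python
import math


def f1(x, y):
    return 200.0 - 175.0 * (math.exp(-(x - 17.0) ** 2) + math.exp(-(y - 17.0) ** 2))


def f2(x, y):
    gx = (x - 12.0) ** 2 * (x - 23.0) * (x - 17.0)
    gy = (y - 12.0) ** 2 * (y - 23.0) * (y - 17.0)
    return 5.0 * (gx + gy)


def F(x, y):
    # Sum of squares of the two residual functions, scaled by .0001.
    return 0.0001 * (f1(x, y) ** 2 + f2(x, y) ** 2)


points = [
    (12.0, 12.0),                    # local minimum, F about 4
    (12.0, 17.0),                    # local minimum, F about .0625
    (17.0, 17.0),                    # saddle-point, F = 2.25
    (17.61014224, 16.110990468),     # global minimum, F about zero
    (16.110990468, 17.61014224),     # symmetrically placed global minimum
]
for p in points:
    print(p, F(*p))
```

Because both residual functions vanish (to the digits given) at the two global minima, these points are exact zeros of the sum of squares rather than merely low points of the surface.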

An observation to be noted is that even finer
contour plots than the one given in Figure 9
failed to expose the locations of the global minima.
Graphical techniques have merit in showing the
overall general behavior of a function, but, as
this test function indicates, graphics cannot be
relied upon to solve optimization problems.

Experiment #1. In order to obtain a measure
of the robustness of the system solver vs. the
robustness of the stand-alone methods of which it
is composed, we ran (a) the system solver, (b) the
local optimizer (Brown's method [3]), and (c) the
simplex method [4,5] from first 10, then 20, and
finally 40 starting points uniformly distributed
in the region of interest, x = (1,31) and
y = (1,31). The results are summarized in the
table given in Figure 6.

The word "success" in that table means that the
method converged to a local minimum (which in some
cases corresponded to a global minimum), whereas
the word "failure" means that the method converged
to a saddle-point, diverged, or failed to converge
in the maximum number of function evaluations or
maximum number of iterations that were allowed.

Experiment #2. The purpose of this experiment
was to see how many points ($N_1$) had to be uniformly
scattered throughout the region in order for the
system solver to produce a global minimum as dis-
tinct from a local minimum. As Figure 7 indicates,


Figure 7




3-D GRAPH OF THE TEST FUNCTION
x = (8,24), y = (8,24)

Figure 8

CONTOUR PLOT OF THE TEST FUNCTION
x = [10.8,17.8], y = [10.8,17.8]
CONTOUR INTERVAL = 2.00 UNITS

Figure 9

as soon as 20 points (or more) are initially
scattered, the system solver converges to a global
minimum. In order to illustrate this experiment in
greater detail, let us consider the case in which
$N_1 = 20$, $N_2 = 5$, and $N_3 = 1$ in Figure 7 (see also
the discussion of phase two in Section III of this
paper). In phase two of this algorithm, an input
of $N_1 = 20$ causes 20 points to be uniformly
scattered by the Aird-Rice algorithm [2] in the
region of interest which is x = (1,31) and
y = (1,31). The location of these points is
shown in Figure 10. The numbering of the points
given in Figure 10 corresponds to the order in
which they were produced by the Aird-Rice method
[2]. Phase two now evaluates F at each of these 20
points. The input parameter value, $N_2 = 5$ (see
Figure 7), now causes phase two to take the 5 points
(of the original 20) having the smallest F value
and perform 3 iterations of the local method from
each of these 5 points. Once again, the F values
of the 5 resulting points are computed. Finally,
the input, $N_3 = 1$ (see Figure 7), causes the selec-
tion of the one point which has thus far produced
the smallest F value to be used as THE starting
point for phase three. That point caused phase
three to converge to the global minimum which had
the value of zero. It is of interest that the
point (19,10) labeled "6" in Figure 10 had the
smallest F value of the original $N_1 = 20$ scattered
points, but if that point had been used directly
as the starting point for phase three, the system
solver would have converged to the local minimum of
F at (12,12) with the corresponding F value of 4.
On the other hand, the point (19,16) labeled "3" in
Figure 10 only ranked third on the list of the 5
best points, based upon F values only; however,
when that point benefited from being run through
the 3 iterations of the local method, the resulting
point had the smallest F value and was used in
phase three, which then produced the global minimum,
(17.61014224,16.110990468), with a corresponding F
value of zero.
In summary, the system solver displayed a high
success rate (see Figure 6) and, once enough points
were used in the initial scattering by phase two
(i.e., once $N_1$ was large enough), the system solver
found a global minimum (see Figure 7).

VI.     FUTURE  RESEARCH  AND  PLANS  FOR  THE  FIRST 
DISTRIBUTED  CODE 

The long range plans for the problem solving
system include:

(1)  The  ability  to  solve  nonlinear  uncon- 
strained optimization  problems  and  non- 
linear systems  of  equation  problems  in 
addition  to  the  current  ability  of  solv- 
ing nonlinear  least  squares  problems. 

(2)  The  ability  to  implement  any  of  a  number 
of  hierarchical  structures  in  phase 
three.    Initially,  we  shall  explore  a 
three-tiered  structure  in  which  the  top 
level  (the  level  which  is  used  and  re- 
used first)  includes  methods  which  re- 
quire Hessian  information.    The  second 
level  of  methods  utilize  gradient  infor- 
mation and  the  bottom  level  methods 
utilize  only  function  values.  Typically, 
the  methods  at  the  top  level  are  the 
most  efficient  whereas  the  methods  at  the 
bottom  level  are  the  most  robust. 

(3)  The  ability  to  plug  in  any  local  method 
or  robust  method  into  the  appropriate 
boxes  in  phase  three  without  the  user 
having  to  reprogram  any  of  the  rest  of 
the  problem  solving  system.  Similar 
abilities  apply  to  phases  two  and  four. 

(4) The development of a control language to
allow the user to make requests of the
system in simple meaningful command
statements.

(5) The creation of a fully modularized
package.

In order to achieve one goal of producing and
distributing useful code by the end of 1977, we
shall concentrate on a specific problem area, the
unconstrained minimization of a sum of squares of
nonlinear functions. We have adopted the follow-
ing implementation strategy:

Phase 1. As we plan to allow derivative or
derivative-free methods in phase three, we shall
implement code to, at the user's option, verify the
correctness of the partial derivatives which he
furnishes. Again, at the user's option, the sys-
tem will automatically provide default values for
the required input parameters. Finally, a scaling
strategy involving the diagonal of the inverse
Hessian matrix of F will be tested and implemented
if successful.

Phase 2. This phase will remain as it is
now, utilizing the Aird-Rice point-dispersion
algorithm [2] followed by three iterations of
Brown's method [3].

Phase 3. We shall adopt the three-tiered
hierarchy, using Brown's Levenberg-Marquardt type
of method [3] at the top of the hierarchy, the
method of descent (searching in the direction of
the negative of the gradient) at the second level
and again Parkinson's nonlinear simplex method
[4,5] at the bottom level. The flow of control
which we shall implement in phase three is given
in Figure 11.

Note. The reason that Brown's method [3] and
the method of descent transfer control to the
simplex method upon successful convergence is to
avoid the difficulties associated with saddle-
points. Our numerical experiments have indicated
that gradient and Levenberg-Marquardt type of
methods will readily converge to a saddle-point;
however, a few iterations of the simplex method
are sufficient to move away from such a point.
Obviously, there will be sufficient tests made to
avoid infinite looping (upon successful convergence)
in phase three.

Phase 4. We shall compare an Aird-Rice [2]
scattering approach with a cyclic coordinate search
in which the coordinate axes have been rotated to
correspond to information obtained from the statis-
tics of the nonlinear regression fit. The best of
these approaches will be used in phase four.

Abstract

The purpose of this paper is to present a
description of a system of subroutines to compute
solutions of the iteratively reweighted least
squares problem where the weights themselves are
functions of the scaled residuals. Starting
points for the iterations are the ordinary least
squares solution, the overdetermined solution in
the $L_1$ norm, or starting points specified by the
user.

Introduction 

The iteratively reweighted least squares
algorithms are a part of robust regression where
"robustness" is used in the statistical sense of
relative insensitivity to moderate departures
from assumptions. The experimental system of
subroutines that is used to compute the solution
to the iteratively reweighted least squares
problem is modular mathematical software written
as a collection of Fortran subroutines. The
subroutines are designed to operate efficiently
and reliably on computing machines of the major
manufacturers. The specific machines to which
we refer are CDC 6600/7600, Honeywell 6000,
IBM 360/370, PDP 10, and Univac 1108.

The software for solving iteratively
reweighted least squares problems represents
interdisciplinary research in numerical analysis,
robust statistics and quality software and, as
such, represents the combined work of many
people. The basic design of the iteratively
reweighted least squares algorithm and the
computation of the default "tuning constants"
for the various weight functions was done by
Paul Holland. The convergence criterion for
iteratively reweighted least squares was
devised by John Dennis. The $L_1$ start from
the overdetermined solution in the $L_1$ norm was
provided by Richard Bartels. The subroutines
for the software for the stem and leaf display
were done by David Hoaglin and Stan Wasserman.
The design of the interactive driver program was
based on the advice of David Hoaglin and Roy Welsch.
Technical and programming assistance was provided
by David Coleman, Neil Kaden, and Sandra Moriarty.
Valuable discussions continue to be held with
David Gay and Richard Hill.

*This work was supported in part by the National
Science Foundation under Grant No. MCS76-11989.

The  substantial  contributions  of  Gene  Golub  have 
been  central  to  this  work.     His  encouragement  to  us, 
his  constructive  work  on  the  numerical  stability 
of  the  algorithms,  and  his  continuing  exposition 
that  crosses  the  boundary  of  robust  statistics  and 
numerical  algebra  constitute  an  essential  resource. 

Section  1 

The method of least squares is versatile and
numerically stable when computationally stable
methods are used, e.g. [1,4]. Nonetheless, least
squares does not give very much information about
outliers or leverage points if one looks simply at
the coefficients x of b = Ax + e. In the
notation b = Ax + e, b is an $m \times 1$ vector of
observations, A is an $m \times n$ data or design matrix,
x is an $n \times 1$ vector of parameters, and e is an $m \times 1$
vector. We recognize that the usual statistical
notation is $y = X\beta + e$ where y is $n \times 1$, X is
$n \times p$, $\beta$ is $p \times 1$, and e is $n \times 1$. In the
iteratively reweighted least squares subroutines
the software for least squares model fitting
technology has been extended to provide more
information about the data and to provide a vehicle
for handling large residual problems.

The ordinary least squares problem is

$\min_x \sum_{i=1}^{m} \left( \frac{r_i(x)}{s} \right)^2$

where

r is the vector of residuals, b - Ax, and s is a scale.

The weighted least squares problem is

$\min_x \sum_{i=1}^{m} w_i \left( \frac{r_i(x)}{s} \right)^2$

which is solved by using ordinary least squares
with $W^{1/2} A$ and $W^{1/2} b$. W is a diagonal matrix
of weights that are functions of scaled residuals.

The iteratively reweighted least squares
problem assumes a start. Presently the $L_2$ start
can be computed from MINFIT from EISPACK II
followed by MINSOL which determines the best
approximate rank of A and computes the least
squares solution. Alternatively the $L_2$ start can
be computed from the subroutine QRF, an orthogonal
decomposition based on Householder transformations,
followed by QRSOL which solves Ax = b from the
output of QRF. An $L_1$ start can be obtained from
the subroutines $L_1$, by Richard Bartels, which
compute the overdetermined solution in the $L_1$ norm.
Given the starting solution, the scale is deter-
mined from subroutine SMAD to get

$s = \operatorname{median} \{ |r_i| / .67449, \; r_i \ne 0 \}$.

The scaled residuals, $r_i / s$, are
formed by subroutine SCLMAD. The weighting matrix
$W^{1/2}$ is determined from any one of the eight sub-
routines that compute the weight functions. Thus,
given $x^{(0)}$ from $L_2$ or $L_1$, the problem is iterated
to obtain $x^{(k+1)} = (A^T W^{(k)} A)^{-1} A^T W^{(k)} b$ using MINFIT
or QR factorizations.
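
The iteration can be illustrated on the simplest possible model, a straight line fitted to data containing one gross outlier. The Python sketch below is not the subroutine system described in this paper: it solves the 2 x 2 weighted normal equations directly instead of calling MINFIT or a QR factorization, it uses the Huber weight with tuning constant 1.345 as one example of a weight function, and it takes the customary MAD normalization constant 0.6745. The names huber_weight, weighted_line_fit, mad_scale, and irls_line are hypothetical.

```python
import statistics


def huber_weight(u, H=1.345):
    return 1.0 if abs(u) <= H else H / abs(u)


def weighted_line_fit(t, b, w):
    """Weighted least squares fit of b ~ x0 + x1*t via the 2x2 normal equations."""
    sw = sum(w)
    st = sum(wi * ti for wi, ti in zip(w, t))
    stt = sum(wi * ti * ti for wi, ti in zip(w, t))
    sb = sum(wi * bi for wi, bi in zip(w, b))
    stb = sum(wi * ti * bi for wi, ti, bi in zip(w, t, b))
    det = sw * stt - st * st
    return ((stt * sb - st * stb) / det, (sw * stb - st * sb) / det)


def mad_scale(r):
    """Median of the nonzero absolute residuals divided by 0.6745."""
    nonzero = [abs(ri) for ri in r if ri != 0.0]
    return statistics.median(nonzero) / 0.6745 if nonzero else 1.0


def irls_line(t, b, n_iter=20):
    x = weighted_line_fit(t, b, [1.0] * len(t))     # ordinary least squares start
    for _ in range(n_iter):
        r = [bi - (x[0] + x[1] * ti) for ti, bi in zip(t, b)]
        s = mad_scale(r)
        w = [huber_weight(ri / s) for ri in r]      # weights from scaled residuals
        x = weighted_line_fit(t, b, w)
    return x


# Data on the line b = 1 + 2t with small alternating errors and one outlier.
t = [float(i) for i in range(11)]
noise = [0.1, -0.1] * 5 + [0.1]
b = [1.0 + 2.0 * ti + ni for ti, ni in zip(t, noise)]
b[8] += 30.0

ols = weighted_line_fit(t, b, [1.0] * len(t))
robust = irls_line(t, b)
print("least squares start:", ols)
print("after reweighting:  ", robust)
```

A fixed number of iterations is used here for brevity; the system described in this paper instead monitors a convergence criterion after each iteration.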

To test convergence, after the $k$-th iteration,
we compute


1/2 


where $\| \cdot \|$ is the Euclidean norm.
Subroutine WGRAD1 computes the gradient and sub-
routine WGRAD2 computes a scale independent measure
of the gradient.

Section  2 

The  term  "robustness"  has  a  common  thread  of 
meaning  that  carries  through  statistical  robustness, 
computational  stability,  and  reliable  software. 
From  the  standpoint  of  reliable  software,  moderate 
departures  from  assumptions  means  that  the  per- 
formance of  the  software  shall  be  unaffected  (in 
the  sense  that  performance  will  not  be  degraded) 
by  the  environment  in  which  the  software  is  run, 
the  compiler  from  which  code  is  generated,  or 
the  applications  system  in  which  the  software  is 
imbedded.     In  particular,  we  program  to  avoid 
abnormal  system  interruption  or  termination. 
Cody [2] has given an excellent exposition of
reliable software and has provided more details
in his position paper for the workshop on robust
software, Computer Science and Statistics,
Ninth Annual Symposium on the Interface.

Throughout the work on iteratively reweighted
least squares heavy emphasis has been placed on
modular subroutines. For example the convergence
criterion can be changed, weight functions can be
added, and numerical equilibration (for columns
of the A matrix) can be invoked. Optionally the
"hat" matrix, which is the projection matrix

$A (A^T A)^{-1} A^T$,

is obtained as $U U^T$ from the singular value
decomposition or $Q Q^T$ from Householder trans-
formations. If desired, the stem-and-leaf display
of the residuals is provided. An interactive
driver program is used to print the singular values,
the $L_2$ condition number, select one or more weight-
ing functions, display residuals, monitor conver-
gence, and optionally select the display of the
diagonal or the upper triangle of the "hat" matrix
and the histogram for the stem-and-leaf.
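
For the special case of a straight-line model with an intercept, the diagonal of the "hat" matrix has a simple closed form, which gives a feel for what the displayed diagonal measures. The Python sketch below uses that closed form rather than a singular value or QR decomposition, so it applies only to this two-parameter model; the function name hat_diagonal_line is hypothetical.

```python
def hat_diagonal_line(t):
    """Diagonal of A (A^T A)^{-1} A^T for the design matrix A with rows [1, t_i]."""
    n = len(t)
    mean = sum(t) / n
    sxx = sum((ti - mean) ** 2 for ti in t)
    # h_ii = 1/n + (t_i - mean)^2 / sum_j (t_j - mean)^2
    return [1.0 / n + (ti - mean) ** 2 / sxx for ti in t]


h = hat_diagonal_line([0.0, 1.0, 2.0, 3.0, 10.0])
print(h)
print(sum(h))
```

The diagonal elements lie between zero and one and sum to the number of parameters (two here); the isolated design point at t = 10 receives by far the largest value, which marks it as a leverage point.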

Our approach to programming design included
documentation for use, and flow of program control,
as comments in the program. We use a declaration
checker to be sure that all variables have been
declared. The Fortran verifier, PFORT, from Bell
Telephone Laboratories, was used to check the sub-
routines, and the Fortran converter, from IMSL,
was used to generate the Fortran code for non-IBM
machines.

The explicit weight functions that we have
used are listed in the Appendix I, Table 1. The
subroutines, with the exception of those required
for the $L_1$ start, are listed in Table 2. Typical
convergence quantities for data from [3] are
listed in Table 3. Appendix II shows a sample
experimental program for one of the weight functions.

Appendix I


Table 1

Examples of weight functions (where u = scaled
residual) and the default tuning constant for
each weight function.

ANDREWS    $w_A(u) = \sin(u/A) / (u/A)$ for $|u| \le \pi A$;  $0$ for $|u| > \pi A$     A = 1.339

BIWEIGHT   $w_B(u) = [1 - (u/B)^2]^2$ for $|u| \le B$;  $0$ for $|u| > B$              B = 4.685

CAUCHY     $w_C(u) = 1 / [1 + (u/C)^2]$                                                C = 2.385

FAIR       $w_F(u) = 1 / (1 + |u|/F)$                                                  F = 1.400

HUBER      $w_H(u) = 1$ for $|u| \le H$;  $H / |u|$ for $|u| > H$                      H = 1.345

LOGISTIC   $w_L(u) = \tanh(u/L) / (u/L)$                                               L = 1.205

TALWAR     $w_T(u) = 1$ for $|u| \le T$;  $0$ for $|u| > T$                            T = 2.795

WELSCH     $w_R(u) = \exp[-(u/R)^2]$                                                   R = 2.985
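
The eight weight functions and their default tuning constants can be collected in a few lines of Python. This sketch is not one of the Fortran subroutines listed in Table 2; it assumes the standard forms of these weight functions from the robust regression literature, and the function name weight and the dictionary TUNING are hypothetical.

```python
import math

# Default tuning constants, keyed by weight function name.
TUNING = {"andrews": 1.339, "biweight": 4.685, "cauchy": 2.385, "fair": 1.400,
          "huber": 1.345, "logistic": 1.205, "talwar": 2.795, "welsch": 2.985}


def weight(name, u, c=None):
    """Weight for scaled residual u; c overrides the default tuning constant."""
    c = TUNING[name] if c is None else c
    a = abs(u)
    if name == "andrews":
        if a > math.pi * c:
            return 0.0
        return 1.0 if a == 0.0 else math.sin(a / c) / (a / c)
    if name == "biweight":
        return (1.0 - (a / c) ** 2) ** 2 if a <= c else 0.0
    if name == "cauchy":
        return 1.0 / (1.0 + (a / c) ** 2)
    if name == "fair":
        return 1.0 / (1.0 + a / c)
    if name == "huber":
        return 1.0 if a <= c else c / a
    if name == "logistic":
        return 1.0 if a == 0.0 else math.tanh(a / c) / (a / c)
    if name == "talwar":
        return 1.0 if a <= c else 0.0
    if name == "welsch":
        return math.exp(-((a / c) ** 2))
    raise ValueError("unknown weight function: " + name)


for name in sorted(TUNING):
    print(name, [round(weight(name, u), 4) for u in (0.0, 1.0, 3.0, 10.0)])
```

Every function gives weight one to a zero residual. The Andrews, biweight, and Talwar functions give weight zero to sufficiently large scaled residuals, whereas the others only reduce the weight gradually.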


X.0 
Table 2


NAME      DESCRIPTION
****      ***********

EQ01      MODIFIED ROW-INF-EQUILIBRATION
EQ02      COLUMN (MAX. ELEMENT) EQUILIBRATION
EQ03      ROW (MAX. ELEMENT) EQUILIBRATION
EQ04      COLUMN (SQRT. SUM OF SQUARES) EQUILIBRATION
EQ05      ROW (SQRT. SUM OF SQUARES) EQUILIBRATION
EUNORM    EUCLIDEAN (SQRT. SUM OF SQUARES) NORM
HMAT      FORMS DIAGONAL OF H-MATRIX (U*U-TRANS)
HMATQR    FORMS DIAGONAL OF H-MATRIX (Q*Q-TRANS)
ISORT1    SHELL SORT (DECREASING) USING INDIRECTION
ISORT2    SHELL SORT (INCREASING) USING INDIRECTION
MINFIT    SINGULAR VALUE DECOMPOSITION A=U*SIGMA*V-TRANSP
MINSOL    SOLVES AX=B GIVEN OUTPUT FROM MINFIT
QRF       QR DECOMPOSITION, Q ORTHOGONAL TRANSFORMATIONS
QRSOL     SOLVES AX=B USING QRF
RESIDE    COMPUTES RESIDUAL B-AX
SCLMAD    SCALE RESIDUALS BY SCALING FACTOR
SLDSPY    DOES STEM AND LEAF DISPLAY (CALLS OTHERS)
SILEAF    DETERMINES STEMS AND LEAVES
SLPRNT    PRINTS STEM AND LEAF DISPLAY
SLSCAL    DETERMINES SCALE FACTOR AND UNIT FOR DISPLAY
SLSCRT    SHELL SORT IN INCREASING ORDER
SMAD      DETERMINES MAD SCALING FACTOR
WANDRW    ANDREWS WEIGHTING FUNCTION
WBIWGT    BIWEIGHT (BISQUARE) WEIGHTING FUNCTION
WCAUCH    CAUCHY WEIGHTING FUNCTION
WELSCH    WELSCH WEIGHTING FUNCTION
WFAIR     FAIR WEIGHTING FUNCTION
WGRAD1    COMPUTES GRADIENT
WGRAD2    COMPUTES SCALE INDEPENDENT MEASURE OF GRADIENT
WHUBER    HUBER WEIGHTING FUNCTION
WLOGIS    LOGISTIC WEIGHTING FUNCTION
WTALWR    TALWAR (ZERO-ONE) WEIGHTING FUNCTION
Table 3


The data from [3], suggested by Paul Holland
as test data, is a 25 x 10 data matrix A with
[illegible] 1.0. The $L_2$ condition number is

$\sigma_{\max} / \sigma_{\min} = (.396 \times 10^{[\text{illegible}]}) \,/\, (.789 \times 10^{[\text{illegible}]})$.

The maximum diagonal
element of the H, or "hat", matrix is .85. Based
on functions of the scaled residuals, the effect
of the weight functions is to downweight some of
the observations.

For the weight functions listed below, the
maximum element in magnitude of the scale-free
measure of the gradient after iteration 1 and after
iteration 10 is as follows.


                      after iter 1              after iter 10

Andrews     start     .383                      .244 x 10^[illegible]
            start     .104                      .177 x 10^[illegible]

Biweight    start     .383                      .245 x 10^[illegible]
            start     .105                      .184 x 10^[illegible]

Huber       start     .383                      .609 x 10^[illegible]
            start     .894 x 10^[illegible]     .435 x 10^[illegible]
C$       ELSE IF (HIS) THEN
CC               HONEYWELL: OFLIM = (2.**127)*(1. - 2.**-27)     ::::::::::
C$       ELSE IF (DEC) THEN
CC               PDP 10: OFLIM = (2.**127)*(1. - 2.**-27)     ::::::::::
C$       ELSE IF (CDC) THEN
CC               CONTROL DATA: OFLIM = (2.**10[illegible])*(2.**48 - 1.)     ::::::::::
C$       ELSE IF (BGH) THEN
CC               BURROUGHS: OFLIM = (8.**[illegible])*(8.**13 - 1.)     ::::::::::
C$       ELSE, 1 CARD
CC       **********   DATA STATEMENT   **********
C$       DATA OFLIM /$INFP/
      DATA OFLIM /Z7FFFFFFFFFFFFFFF/
C
C     ::::::::::   UFETA IS THE SMALLEST POSITIVE FLOATING POINT NUMBER
C                  S.T. UFETA AND -UFETA CAN BOTH BE REPRESENTED.
C$       IF (IBM) THEN
CC               IBM 360/370: UFETA = 16.**-65     ::::::::::
C$       ELSE IF (XEROX) THEN
CC               XEROX: UFETA = 16.**-65     ::::::::::
C$       ELSE IF (UNIVAC) THEN
CC               UNIVAC: UFETA = 2.**-129     ::::::::::
C$       ELSE IF (HIS) THEN
CC               HONEYWELL: UFETA = (2.**-128)*(2.**-1 + 2.**-27)     ::::::::::
C$       ELSE IF (DEC) THEN
CC               PDP 10: UFETA = 2.**-129     ::::::::::
C$       ELSE IF (CDC) THEN
CC               CONTROL DATA: UFETA = 2.**-975     ::::::::::
C$       ELSE IF (BGH) THEN
CC               BURROUGHS: UFETA = 8.**-51     ::::::::::
C$       ELSE, 1 CARD
CC       **********   DATA STATEMENT   **********
C$       DATA UFETA /$ETA/
      DATA UFETA /Z0010000000000000/
C
C
C     *****BODY OF PROGRAM:
      IF (CONST .LE. 1.0D0) PROD = OFLIM * CONST
      IF (CONST .GT. 1.0D0) PROD = UFETA * CONST
FILE: BIWGT


      DO 100 I = 1,N
         U1 = DABS(U(I))
         IF (U1 .LT. CONST) GO TO 10
C        ::::::::::   DABS(U(I)) .GE. CONST   ::::::::::
         SQW(I) = 0.0D0
         GO TO 100
   10    CONTINUE
         IF (CONST .GT. 1.0D0) GO TO 20
         IF (U1 .LE. PROD) GO TO 20
C        ::::::::::   DIVISION WOULD OVERFLOW   ::::::::::
         SQW(I) = 0.0D0
         GO TO 100
   20    CONTINUE
         IF (CONST .LE. 1.0D0) GO TO 30
         IF (U1 .GE. PROD) GO TO 30
C        ::::::::::   DIVISION WOULD UNDERFLOW   ::::::::::
         SQW(I) = 1.0D0
         GO TO 100
   30    CONTINUE
C        ::::::::::   FUNCTION CAN BE COMPUTED NORMALLY   ::::::::::
         U1 = U(I) / CONST
         SQW(I) = ((0.5D0 + U1) + 0.5D0)*((0.5D0 - U1) + 0.5D0)
  100 CONTINUE
      RETURN
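The listing above guards the division U(I)/CONST against overflow and underflow before forming the square root of the biweight weight, 1 - (u/CONST)**2. A rough Python transcription of that logic follows. It is a sketch under assumptions: the machine constants are replaced by Python's `sys.float_info` values, the branch structure follows our reading of the damaged listing, and the name `sqrt_biweight` is hypothetical.

```python
import sys

OFLIM = sys.float_info.max   # stand-in for the machine overflow limit
UFETA = sys.float_info.min   # stand-in for the smallest positive normalized number

def sqrt_biweight(u, const):
    """Square roots of the biweight weights, 1 - (u/const)**2, with guards."""
    # PROD is the threshold beyond which u / const cannot be formed safely.
    prod = OFLIM * const if const <= 1.0 else UFETA * const
    out = []
    for ui in u:
        u1 = abs(ui)
        if u1 >= const:
            out.append(0.0)          # outside the support: zero weight
        elif const < 1.0 and u1 > prod:
            out.append(0.0)          # u1 / const would overflow
        elif const > 1.0 and u1 < prod:
            out.append(1.0)          # u1 / const would underflow: treat u as 0
        else:
            r = ui / const
            # (1 + r) * (1 - r), written as in the listing
            out.append(((0.5 + r) + 0.5) * ((0.5 - r) + 0.5))
    return out
```

For example, `sqrt_biweight([2.0], 4.0)` gives `[0.75]`, that is 1 - (2/4)**2, and a residual so small that the quotient would underflow is simply given full weight.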
An Experimental Interactive System for Integer Programming


Monique Guignard
Wharton School
University of Pennsylvania

Kurt Spielberg
IBM Scientific Marketing


Abstract

The paper describes an experimental
interactive system for the solution of
integer programming problems (primarily of
0-1 nature, but not exclusively so).

The system includes most techniques which
appear to offer hope of overcoming the
combinatorial difficulties of all integer
programming algorithms: in particular LP
with cuts, BB with propagation, State
enumeration, Interval reduction, Preferred
variable reduction, Exploitation of
Benders inequalities, Local search,
Heuristics, etc.

There  is  emphasis  on  interaction  by  the 
analyst  at  the  terminal.  An  example 
illustrates  the  use  of  various  tools.  A 
table  summarizes  results  obtained  by  a 
number  of  different  approaches  to  a 
problem  of  moderate  difficulty. 

1.  Introduction 


We  describe  an  experimental 
interactive  system  for  0-1  programming. 
The  programs  are  written  in  APL. 
Various  techniques  have  been  or  are 
planned  to  be  incorporated  in  this 
system.  They  are  intended  to  be  called 
by  the  user,  independently  from  each 
other (subject to restrictions which we
would like to make as little painful as
possible), or sequentially, depending
on the results of the experimentation.

No  single  technique  can  be 
expected  to  solve  all  types  of  integer 
problems.  But  our  preliminary 
experimental  results  are  encouraging 
and  suggest  that  a  truly  flexible 
interactive  system  has  potential  for 
aiding  in  the  understanding  and 
solution  of  integer  programs,  beyond 
what  can  be  expected  from  a  standard 
preset  program. 

The  paper  is  not  intended  to  give 
all  algorithmic  details.  In  section  2 
an  overview  of  techniques  and  features 
is  given,  with  references  to  expository 
papers. For convenience, certain key
concepts are briefly summarized in an
appendix.

What counts here is that there
are certain techniques for "looking"
at the problem, gleaning more
information from it in simple form
(e.g., in terms of simple logical
relations among variables, simple
inequalities, stronger bounds, etc.),
and taking judicious action at a computer
terminal, guided by the printout of the
"current" state of the solution
process. The user may find some help
from the example presented in section
3.

Can interactivity play a major
role in solving difficult problems? We
increasingly believe that the answer is
yes if the user of the system is well
versed in integer programming. Whether
one shall be able eventually to
construct useful interactive systems for
the non-specialist is somewhat
difficult to predict now. We are
optimistic, but much work remains to be
done.

2. Techniques of the System

At present the following features
and techniques have been incorporated:

2.1 Linear Programming (LP),

with possible addition of
Gomory-Johnson cuts and dual
reoptimization until either no real
progress is made any more (in terms
of changing objective function), or
until the number of reoptimizations
or cuts reaches upper bounds imposed
by the user. <1,2,3,4>

2.2 Branch and Bound Programming (BB),

with propagation on nonbasic
(possibly preferred) variables. A
single LP is solved at the current
origin (or node) of the search tree
and penalties are computed. A branch
on a nonbasic variable set at its
optimal LP value is called
propagating. One will want to
propagate as long as the "alternate"
penalty is large. <5,6,7>

2.3 Benders Inequalities,

which are generated at the end of an
executed LP. They are intended to be
used either

(i) after solution of a problem
within a BB or Enumeration
scheme (usually with some aid
from cutting planes), or

(ii) after guesses at partial
integer solutions have been
entered by the user in order to
get logical conditions which
might characterize the problem.
After a guess is entered, the
remaining linear program is
resolved and the Benders
inequality is taken from the
final LP tableau (whether
feasible or infeasible).

Note: it may not be permissible
to use cuts in such an instance.

In general, Benders inequalities are
collected and retained for
exploitation in terms of reduction
(see 2.4, 2.5 below). In contrast,
Gomory-Johnson cuts (and we believe
other cuts of similar genesis as
well) are not well suited for
reduction (i.e., they lead to logical
inequalities of large degree). After
use in an LP, they are therefore
usually discarded. <8,9>
2.4 Reduction Procedures,

are invoked throughout to

(i) shrink bound intervals, and

(ii) generate logical inequalities.

The interval reduction procedure is
used iteratively, screening all
constraints so as to compute the
tightest possible bounds on
structural and slack variables. Great
care must be taken to avoid round-off
difficulties; i.e., one must use
appropriate scaling and tolerances
(as in LP), especially when Benders
inequalities are used for reduction.
<9,10,11,12,13>
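The iterative screening of constraints can be illustrated by a small bound-tightening routine. The Python sketch below is not the authors' APL procedure; it is a minimal version of the general idea for constraints of the form sum a(j) y(j) <= b with integer variables, and the name `reduce_intervals` and the tolerance `tol` are assumptions made for the example.

```python
import math

def reduce_intervals(rows, lo, hi, tol=1e-9):
    """Tighten integer bounds lo <= y <= hi using rows of (coeffs, rhs),
    each meaning sum(coeffs[j] * y[j]) <= rhs.
    Returns (lo, hi), or None if some row cannot be satisfied."""
    lo, hi = list(lo), list(hi)
    changed = True
    while changed:                       # repeat until no bound moves
        changed = False
        for a, b in rows:
            # Smallest possible left-hand side under the current bounds.
            min_act = sum(c * (lo[j] if c > 0 else hi[j])
                          for j, c in enumerate(a))
            slack = b - min_act
            if slack < -tol:
                return None              # row infeasible
            for j, c in enumerate(a):
                if c == 0:
                    continue
                # Units by which y[j] may leave its "cheap" bound.
                step = math.floor(slack / abs(c) + tol)
                if c > 0 and lo[j] + step < hi[j]:
                    hi[j] = lo[j] + step
                    changed = True
                elif c < 0 and hi[j] - step > lo[j]:
                    lo[j] = hi[j] - step
                    changed = True
    return lo, hi
```

With the two rows 5y1 + 2y2 - y3 <= 1 and 2y1 + 3y2 >= 3 (entered as `([-2, -3, 0], -3)`) on 0-1 variables, the routine first fixes y1 at 0, then y2 at 1, then y3 at 1. The tolerance plays the role of the round-off safeguards mentioned above.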

2.5 Logical Inequalities.

Given any inequality in (0,1) or
integer variables, one can derive
from it a set (possibly empty, but
not usually so) of minimal preferred
variable inequalities (m.p.i.'s). The
degree of the system ("size" of the
smallest inequalities; the degree of
an inequality i will be called Pl(i)
in the printouts) is a good measure
of the tightness of the problem. The
bound interval of one of the
preferred variables, at least, must
be shrunk by one unit.

These logical inequalities are among
the main tools for guiding both the
BB scheme and the enumeration (see
also "penalty improvement").

<9,10,14,15>

As the referee points out, there
exist logical inequalities
("Boolean", "Canonical" inequalities;
e.g., see <16,17>) which are more
general than m.p.i.'s. They are
especially important in the complete
characterization of the underlying
problem. However, their overabundance
may present computational
difficulties. We believe we have good
reasons for preferring to work with
the more special properties of
m.p.i.'s (used computationally
already in <9>).

2.6 Enumeration,

is based on the additive algorithm of
Balas <18>, modified in a number of
ways, with the search starting at the
origin y=(0,0,...,0). Branches are
restricted to minimal preferred
variable sets. The actual set to be
used is the union of those preferred
variables (in minimal preferred sets)
which have good contraction
properties:

1st priority: double contraction
2nd priority: single contraction
3rd priority (default): all free
variables.

(A branch is "contracting" when the
degree of the system is guaranteed
to be decreased as a result.
Contraction can be determined from
the set of minimal preferred
inequalities.) Within the finally
chosen set, the actual branch
variable is selected so as to lead to
minimal (maximal) overall
infeasibility of the problem at the
next node in phase 1 (phase 2) of the
enumeration. <10>

In the above context, phase 1 is
meant to be the period during which
the search is directed towards
finding improved feasible solutions,
whereas phase 2 is to deal with the
establishment of optimality (possibly
within certain tolerances) once a
good integer solution has been
generated.

Note: These branching criteria can
be rendered inoperative when
the "local search" (see 2.10)
identifies an improving
direction.

2.7 State Enumeration,

differs from enumeration as follows.
At any node, a state is determined
(usually by rounding of an LP
solution). It is essentially a
particular value for the integer
vector y, say $\bar{y}$, believed to be
close to feasible integer solutions.
In the zero-one case, the search
variables for an enumeration such as
that of 2.6 are then taken to be
y(j) if $\bar{y}$(j) = 0 and 1 - y(j) if
$\bar{y}$(j) = 1. (See <12> for the
generalization to the integer case
with small bound intervals.)

I.e., the final enumeration is
carried out with transformed
variables y(j) set at 0 initially
and increased on forward steps of the
algorithm (as in <18>). In dynamic
State Enumeration, we can envisage
that the search variables y(j) may
be redefined over the free variables
at each node, provided that
reasonably good information is
available for doing so.

We have found State Enumeration, with
strategies roughly as outlined in
2.6, 2.7 (there are many other
possibilities of sometimes highly
complex nature), highly effective in
finding good feasible integer
solutions fast, especially when
augmented by a simple "one-level"
local search and by a "ceiling test"
(see 2.8, 2.9). <3,4,14>
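The change of variables behind state enumeration is easy to make concrete. In the Python sketch below (hypothetical names, not the authors' APL code) a 0-1 inequality sum a(j) y(j) <= b is rewritten in search variables t(j), where t(j) = y(j) when the state entry is 0 and t(j) = 1 - y(j) when it is 1, so that the search origin t = 0 corresponds to the state itself.

```python
from itertools import product

def complement_for_state(a, b, state):
    """Rewrite sum(a*y) <= b in search variables t, with
    y[j] = t[j] if state[j] == 0 and y[j] = 1 - t[j] if state[j] == 1."""
    a_new = [(-c if s == 1 else c) for c, s in zip(a, state)]
    b_new = b - sum(c for c, s in zip(a, state) if s == 1)
    return a_new, b_new

def to_original(t, state):
    """Map a point in search variables back to the original 0-1 variables."""
    return [tj ^ sj for tj, sj in zip(t, state)]

# Small check: both forms accept exactly the same points.
a, b, state = [3, -2, 4], 5, [1, 0, 1]
a_new, b_new = complement_for_state(a, b, state)
for t in product((0, 1), repeat=3):
    y = to_original(t, state)
    lhs_y = sum(c * v for c, v in zip(a, y))
    lhs_t = sum(c * v for c, v in zip(a_new, t))
    assert (lhs_y <= b) == (lhs_t <= b_new)
```

Starting the enumeration at t = 0 then amounts to starting it at the rounded LP point, which is the intent of the state.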

2.8 Ceiling Test

Most enumeration schemes have tests
involving the objective function of
the initial tableau (often with
non-negative cost coefficients). In
state enumeration, the original
objective function becomes less
interesting and can be taken care of,
anyway, as a special Benders
inequality. Instead of the original
objective function, we have found it
useful to concentrate on the final
objective function row from the
optimal tableau of an initial LP
problem (with cuts added when deemed
desirable). The row is retained
(coefficients and all necessary
information about the nature of the
nonbasic variables) and permits the
fixing of variables and/or shrinking
of bound intervals for the nonbasic
variables, among them original slack
variables as well as structural
variables.

It should be noted that the shrinking
of slack bound intervals can be used
as input to the various reduction
procedures and can lead to further
information about all the variables
there.
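One common way to use a retained objective row is the following. If z_lp is the LP optimum and d(j) >= 0 is the row coefficient of a nonbasic variable, moving that variable k units away from its current bound costs at least k d(j), so k cannot exceed (ZSTAR - z_lp)/d(j). The Python sketch below illustrates this test with made-up numbers; the name `ceiling_test` and the tolerance are assumptions, and the authors' actual test may differ in detail.

```python
import math

def ceiling_test(z_lp, zstar, d, tol=1e-9):
    """For each nonbasic variable with objective-row coefficient d[j] >= 0,
    return the largest admissible displacement from its current bound,
    or None when the row gives no restriction (d[j] == 0)."""
    gap = zstar - z_lp
    return [None if dj <= tol else math.floor(gap / dj + tol) for dj in d]

# Hypothetical numbers: LP optimum 18, ceiling 25.
limits = ceiling_test(18.0, 25.0, [10.0, 3.5, 0.0])
```

Here `limits` is `[0, 2, None]`: the first variable is fixed at its current bound, the second may move at most two units (for a 0-1 variable this is no restriction), and the third is unrestricted by this row.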

2.9 Bounds on Slacks; Compression

Our linear programming system and all
related integer programming
procedures permit the imposition of
upper and lower bounds on the slacks.
Equality constraints, for example,
are handled by imposition of zero
bounds above and below. The minimal
inequality reduction procedures work
with inequalities only, and we
generate two such inequalities as
input to reduction whenever we know
(or believe) that the upper bounds on
the slacks are "true" (i.e., truly
restrictive) bounds.

Some of our sample problems are known
to have two inequalities representing
actually only one equality
constraint. The system therefore has
a "compress" function detecting such
situations and (re)creating the
(original) equality constraints.
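A "compress" step of this kind can be sketched as a search for pairs of rows that are exact negatives of each other. The Python code below is a simplified illustration with hypothetical names; it does not detect rows that are scaled multiples of one another, which a production routine might also want to handle.

```python
def compress(rows, tol=1e-9):
    """rows: list of (coeffs, rhs), each meaning sum(coeffs*y) <= rhs.
    Pairs of the form a.y <= b and -a.y <= -b are merged into a.y = b.
    Returns (inequalities, equalities)."""
    used = [False] * len(rows)
    ineqs, eqs = [], []
    for i, (a, b) in enumerate(rows):
        if used[i]:
            continue
        for k in range(i + 1, len(rows)):
            a2, b2 = rows[k]
            if (not used[k] and abs(b + b2) <= tol
                    and all(abs(c + c2) <= tol for c, c2 in zip(a, a2))):
                used[i] = used[k] = True
                eqs.append((a, b))       # keep one copy as an equality
                break
        else:
            ineqs.append((a, b))         # no partner found
    return ineqs, eqs
```

For the rows y1 + 2y2 <= 3, y1 <= 1 and -y1 - 2y2 <= -3, the routine returns the single inequality y1 <= 1 and the single equality y1 + 2y2 = 3.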

2.10 Local Search

Any BB or Enumeration algorithm may
benefit from a search around the
"current" point under consideration.
In the present system, the "depth" of
the local search is controlled by a
parameter (LEV). The value 0
corresponds to no search (just one,
the current, point considered), the
value 1 corresponds to the
alteration of all f "free" (not
fixed) variables by one unit at a
time. In general, one enumerates
f!/(f-LEV)! adjacent points. In view
of the exponential increase in the
number of such points, only the
values 1 and 2 and conceivably 3
appear to be reasonable for LEV.
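For 0-1 variables, a local search of small depth can be written as a generator of neighboring points. The Python sketch below is illustrative and its names are hypothetical. It complements up to `lev` free variables at a time; the helper `ordered_count` evaluates the expression f!/(f-LEV)! quoted above, which counts ordered selections and is therefore larger than the number of distinct points once the depth exceeds 1.

```python
from itertools import combinations
import math

def neighbors(y, free, lev):
    """Yield 0-1 points obtained from y by complementing between 1 and
    `lev` of the free variables (indices listed in `free`)."""
    for k in range(1, lev + 1):
        for idx in combinations(free, k):
            z = list(y)
            for j in idx:
                z[j] = 1 - z[j]
            yield z

def ordered_count(f, lev):
    """f! / (f - lev)!, the count given in the text."""
    return math.perm(f, lev)
```

With four free variables, depth 1 yields 4 points and depth 2 yields 10 distinct points, while `ordered_count(4, 2)` is 12; either way the growth with depth is rapid, which is why only very small depths are practical.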

2.11 Heuristics for Feasible Solutions

Several heuristic methods have been
programmed which try to generate
feasible solutions to a system of
linear inequalities in 0-1 variables.
One of them starts with a given 0-1
vector and modifies the component
which maximally decreases the sum of
the infeasibilities or the number of
infeasibilities. Such steps are
repeated as often as specified. One
way of using such modifications is to
have a complete "forward pass" over
all the variables, followed by a
complete "backward pass". I.e., all
variables are altered in a pass, the
sequence being determined by one of
the criteria mentioned above. <19>

Another heuristic concentrates on the
logical inequalities of small degree
and tries to generate a feasible
solution to the current system of
minimal preferred inequalities, which
can then be used as a starting point
for the first heuristic, or for a
local search, or as a state for the
state enumeration algorithm. <3,4,...>
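The first heuristic can be sketched as a greedy loop that repeatedly complements the single component giving the largest decrease in the sum of infeasibilities. The Python code below is a minimal illustration, not the authors' program; the function names and the stopping rule (stop when no single change helps) are assumptions.

```python
def infeasibility(rows, y):
    """Sum of violations of the rows (coeffs, rhs): sum(coeffs*y) <= rhs."""
    return sum(max(0.0, sum(c * v for c, v in zip(a, y)) - b)
               for a, b in rows)

def greedy_flip(rows, y, max_steps=100):
    """Complement one 0-1 component at a time, always the one that
    decreases the sum of infeasibilities the most."""
    y = list(y)
    for _ in range(max_steps):
        cur = infeasibility(rows, y)
        if cur == 0:
            break                          # feasible point reached
        best_j, best = None, cur
        for j in range(len(y)):
            y[j] = 1 - y[j]                # try the change
            val = infeasibility(rows, y)
            y[j] = 1 - y[j]                # undo it
            if val < best:
                best_j, best = j, val
        if best_j is None:
            break                          # no single change helps
        y[best_j] = 1 - y[best_j]
    return y, infeasibility(rows, y)

# y1 + y2 >= 1 and y1 + y3 <= 1, starting from (0, 0, 1).
rows = [([-1, -1, 0], -1), ([1, 0, 1], 1)]
point, remaining = greedy_flip(rows, [0, 0, 1])
```

In this small case one step suffices: complementing y2 gives the feasible point (0, 1, 1). Counting violated rows instead of summing violations would give the second criterion mentioned in the text.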
2.12 Row Combination

It has been known for a long while
that linear combinations of rows may
be more interesting than the rows
themselves, but there is little
insight into how the inequalities are
to be generated (possible exceptions
are references <20> and <21>).
Consistent with the main emphasis of
this paper, we have chosen to take
the degree of an inequality as its
basic measure of strength. Hence we
allow for "visual" combination of
(≤) inequalities (e.g., so as to
have coefficients of opposite sign
combine to produce small results in
magnitude on the left and large
negative right hand sides, if
possible), and we test the resultant
inequality by reduction.

Alternatively, we also have heuristic
algorithms of modest capability which
utilize minimal preferred
inequalities to give automatic row
combinations, with reduction used
again to check out the results and
terminate the process.

2.13 Improvement of Penalties

Penalties can be improved by taking
into account logical conditions on
the integer variables, such as the
minimal preferred inequalities. If
some nonbasic variable must be
modified, it may be the case that at
least some of a set of other nonbasic
variables must also be altered to
achieve feasibility. This usually
leads to the accrual of an additional
penalty. One may utilize improved
penalties a priori in a preprocessing
of the penalty table for BB
programming, or dynamically to store
nodes with increased alternate
penalties in BB with propagation.
<6,7>


3. An Example of a Terminal Session

1) START1

BM1515 is a (16 by 16) tableau, of
the form

Co   C
-B   A

corresponding to the minimization of
z = Co + C . y, subject to:

A . y ≤ B

0 ≤ y ≤ 1, all y(j) integer

It represents a 15 variable problem
of moderate difficulty, with 15
constraints and a dense constraint
matrix (taken from <22>, problem
#13).


START1
ENTER M,N

15 15

ENTER TABLEAU, EXPANDED FORM
□←BM1515

0 7 1 3 4 2 6 2 1 5 1 5 7 2 7
36 2 6 1 0 3 3 2 6 2 2 5 3 3 7 9
22 ¯5 5 ¯8 3 0 ¯1 ¯3 ¯8 ¯9 3 ¯8 6 ¯3 ¯8 ¯6
3 5 ¯6 ¯5 ¯3 ¯8 8 ¯9 ¯2 0 9 ¯1 7 9 ¯9 4
¯1 ¯9 ¯5 0 9 ¯1 8 ¯3 9 9 3 0 ¯7 5 4 ¯9
10 8 ¯7 4 5 9 ¯1 7 1 ¯3 2 0 ¯3 ¯5 9 7
12 7 5 2 0 6 6 7 6 ¯7 ¯7 ¯1 ¯8 ¯3 9 1
1 ¯1 3 3 1 0 4 ¯1 ¯6 0 8 0 1 ¯5 ¯4 2
¯1 2 ¯6 9 0 7 ¯9 9 6 ¯4 5 3 1 ¯3 ¯9 3
¯6 ¯7 ¯2 ¯2 0 ¯6 ¯6 7 ¯4 0 2 8 0 4 ¯2 0
0 0 0 0 0 0 0 0 ¯1 ¯1 ¯1 ¯1 1 1
14 ¯1 3 3 ¯4 5 5 7 8 ¯8 0 0 ¯1 ¯2 4 5
¯9 8 0 5 0 2 6 7 1 ¯1 1 4 ¯5 ¯7 ¯8 2
25 ¯2 ¯7 ¯8 6 ¯2 ¯5 ¯9 ¯2 2 ¯8 1 ¯3 5 6 ¯8
3 7 9 ¯7 ¯5 1 ¯5 ¯5 4 9 3 ¯4 0 0 7 ¯7
32 7 1 ¯3 0 6 ¯3 7 8 1 ¯6 6 8 ¯3 8
2) Initial LP. Preprocessing.


No variable can be fixed by
reduction in the initial
preprocessing. The continuous LP
optimum is 14.96 and 5 variables
out of 15 are fractional. A Benders
inequality is computed from the
optimal LP tableau and printed out.
(ZSTAR stands for the best objective
function value found, or for an
upper bound supplied by the user. We
use a large value as the default.)

One enters a "state" ST1,
using a rounded LP solution with
rounding parameter p = .5 (entries
1..keep y(j), entries 0..use
complemented variables 1 - y(j)).


BENDERS INEQUALITY OF TYPE 1

( 5.844 0 ¯6.122 6.741 3.03 0 0 ¯1.121 0 0 0.7512 2.862 8.055
1.963 ¯0.7163 ) × Y ≤ ¯22.92 +ZSTAR


ST1←1 0 0 1 1 1 1 0 1 1 1 1 1 1 0
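The constant 22.92 in the printed inequality can be reproduced from the LP optimum 14.96 and the printed coefficients. The Python sketch below assumes, as our reading rather than a statement from the paper, that the coefficients are the objective-row entries of the nonbasic variables and that negative entries belong to variables sitting at their upper bound 1. Under that reading z >= 14.96 + sum of d(j) y(j) over positive entries + sum of |d(j)| (1 - y(j)) over negative entries, and requiring z <= ZSTAR gives the printed form. The function name is hypothetical.

```python
def benders_rhs_constant(z_lp, coeffs):
    """Constant K in  coeffs . y <= ZSTAR - K, assuming negative
    coefficients correspond to variables at upper bound 1."""
    return z_lp + sum(-d for d in coeffs if d < 0)

coeffs = [5.844, 0, -6.122, 6.741, 3.03, 0, 0, -1.121, 0, 0,
          0.7512, 2.862, 8.055, 1.963, -0.7163]
k = benders_rhs_constant(14.96, coeffs)   # close to the printed 22.92
```

The three negative entries sum to 7.9593 in magnitude, and 14.96 + 7.9593 agrees with the printed 22.92 to the accuracy of the printout.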


3) Guesses

One is prompted for guesses.
One enters indices and corresponding
values for a subset of the integer
variables (unspecified variables
being left unconstrained) and
resolves one resulting LP.

The final LP tableau yields a
Benders Inequality of type 1 if
feasible, of type 2 if infeasible.

The function GUESS works with the
last Benders inequality.

GUESS 0 chooses the free variables
with negative coefficients and sets
them to 0. The guess is then likely
to be infeasible and a new
feasibility condition will be
generated (B.I. of type 2).

GUESS 1 chooses the free variables
with positive coefficients, and one
sets these variables to 1. In this
case also one constrains the current
B.I. and tries to obtain tighter
feasibility conditions on a subset
of the variables.


Any given guess leads to the
resolution of one linear program,
while the Benders inequality is
retained for further processing
(possibly after a check as to
whether it is strong enough, i.e.,
has degree low enough). For large
problems, the guesses will most
likely be generated automatically
according to a scheme which the user
may have developed for a smaller
problem of similar structure.

Having some familiarity with a
given problem (and by that we do not
mean knowing its solution), one can
usually think of a number of other
promising guesses. E.g., one may
want to set to one some variable(s)
with large positive coefficient(s) or
at zero some variable(s) with large
negative coefficients; or one makes
guesses in consonance with one's
knowledge of any possible special
structure.


WANT TO GUESS AT SOLUTION?
YES

ENTER GUESS FOR Y, FIRST INDICES, THEN VALUES

GUESS 1
2 10 12
□:

1

ACTUAL NUMBER OF ROWS FOR MAT :18
PROBLEM NOT FEASIBLE
DETERMINANT= 1.004E6
BENDERS INEQUALITY OF TYPE 2

( 36 ¯1.563 ¯10.11 0 11.02 5.74 0 0 ¯6.753 25.63 0 6.149 0 0 0)
× Y ≤ 12.12
WANT TO GUESS AT SOLUTION?
YES

ENTER GUESS FOR Y, FIRST INDICES, THEN VALUES
7
0


11.47 SEC

OBJECTIVE FUNCTION = 15.12
NO. OF PIVOT STEPS =29
DETERMINANT= 1.045E6
STRUCTURALS: 0.01126 0.7627 1 0 0 0.0874 0 1 0.3813 0.2613 0.1264
0 0 0 1

BENDERS INEQUALITY OF TYPE 1
( 0 0 ¯9.671 8.708 2.266 0 ¯3.909 ¯0.6645 0 0 0 3.872 11.21 2.857
¯6.548 ) × Y ≤ ¯32 +ZSTAR

4) After all guesses have been entered,
and the corresponding LP's have been
solved, with a final Benders
Inequality always adjoined to the
tableau, PREPROC checks for interval
reduction and fixing of variables.
In this problem variables y1, y9 and
y15 can be fixed at 0, 1 and 1,
respectively.


N. VARS. FIXED 3

LU2[1;] (lower bounds)

000000001000001

LU2[2;] (upper bounds)

011111111111111

5) The starting LP for the BB (or an
Enumeration) procedure is then
solved with these values
substituted, and the initial
objective function value is thus
raised to 15.7.


Dual reoptimization after the
generation and temporary addition of
Gomory-Johnson cuts finally brings
the initial BB objective function to
18.39.


OBJECTIVE FUNCTION = 15.7
NO. OF PIVOT STEPS =25
DETERMINANT= 4.967E5

STRUCTURALS: 0 0.8451 1 0 0 0.1926 0.1758 0.3951 1 0.2186 0 0 0 0
1

BENDERS INEQUALITY OF TYPE 1

( 7.362 0 ¯4.894 5.722 1.515 0 0 0 3.848 0 0 1.453 8.021 3.042
1.545 ) × Y ≤ ¯15.1 +ZSTAR
5 CUT(S) GENERATED


OBJECTIVE FUNCTION = 18.39
NO. OF PIVOT STEPS =87
BENDERS INEQUALITY OF TYPE 1

( 10.25 0 ¯3.911 3.535 0.09996 0.06115 0 0 4.343 0 0 0 5.384
0.004794 1.39 ) × Y ≤ ¯16.57 +ZSTAR

STRUCTURALS: 0 0.1975 1 0 0 0 0.6106 0.4917 1 0.1856 0.101 0.5849
0 0 1
6) BB starts. The system computes
penalties (added to the current
objective function value to give a
maximal lower bound on the final
objective function value).

In the normal BB mode, the
generated node (i.e., the current
bounds and z + penalty) is stored in
the node table, and the next node
would be brought in.

In the propagation mode one looks
for variables to be fixed at their
current value, with an alternate
node stored in the table, and the
current node being processed
further.

In this example a propagating
variable y(13) is identified and
the alternate (y(13)=1; often other
conditions can be imposed from the
minimal preferred inequalities) is
stored with penalty 5.384.

The user is asked whether he accepts
the propagation. If the answer is
yes, the propagation proceeds; if
not, the user has the choice of
proceeding in the normal BB mode or
of trying to finish the current
problem (in core) by state
enumeration. In the present case we
accept the first propagation
(storing an alternate node with
penalty 5.384) and reject the second
possibility with alternate penalty
3.911 (the system, of course,
generates the penalties in
descending order of magnitude).


PRP,PEN,CTR 13 5.384 0

ACCEPT ?
YES

OUT: (store node)

**********************
1 23.77 5.384 ¯13 (node)


(PRP=13: propagate with y13;
PEN=5.384: alternate penalty)


PRP,PEN,CTR ¯3 3.911 0

ACCEPT ?
NO

FOR ALTERNATE CALL STRAT, BP, NBP HERE

SAVE NOW, BEFORE ENUM.
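The stored node illustrates the bound arithmetic: the current objective value 18.39 plus the alternate penalty 5.384 gives the lower bound 23.77 shown in the node table. A node table of this kind can be sketched in Python as a priority queue keyed on the bound. The layout of an entry and the function names below are assumptions made for the example, not the authors' APL data structure, and the entry ¯13 is simply carried along as printed.

```python
import heapq

def store_alternate(table, node_id, z, penalty, branch):
    """Store an alternate node with lower bound z + penalty."""
    bound = z + penalty
    heapq.heappush(table, (bound, node_id, penalty, branch))
    return bound

def next_node(table, zstar):
    """Bring in the stored node with the smallest bound, skipping
    nodes whose bound already exceeds the ceiling zstar."""
    while table:
        node = heapq.heappop(table)
        if node[0] <= zstar:
            return node
    return None

table = []
bound = store_alternate(table, 1, 18.39, 5.384, -13)   # about 23.77
```

With ZSTAR = 25 the node cannot be discarded on its bound alone (23.77 does not exceed 25), which is why it has to be brought back and examined once the enumeration is finished.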


7) Enumeration starts (from the state
ST1). Variable 13 is at 0 (via
propagation), and y1, y9, y15 have
been fixed globally. One branches on
y8=1, and a feasible solution is
found immediately (undoubtedly due
to the imposition of the state),
with obj. function z=26.


Pl 4442243231053434104222101021032105 (row degrees)
K IS 1 (local search level)
Z 26

FEASIBLE SOLUTION 26
SOLUTION FOUND AT NODE 2 LEVEL 1 PHASE 1

OBJECTIVE FUNCTION 26
NEW ZSTAR =25

SOLUTION 011000111001001
INT. SOL. FEASIBLE , Z= 26

8) Enumeration is completed after examination of a total of 17 nodes.


V/BND= 1 INF. IN BOUNDRD Q IN (bound reduction)
END OF ENUMERATION
******************
OPTIMAL SOLUTION 011000111001001
OPT. OBJ. FUNCTION 26
9) Return to BB.


The stored alternate node is brought
back from the tree into high speed
storage. After imposition of the
bounds, the inequality system
(including Benders Inequalities)
with z ≤ z* - 1 = 26 - 1 = 25, is now
infeasible. End of BB.

Termination of integer program.


IN OF NODE 1
*****************

1 23.77 5.384 ¯13 (node number, obj. f. value, pen., br. var.)


RED (reduction)
NOPV 0
PPLUS 2 0 15
INFEASIBLE-NOPV 0

Computational Results

There are a number of problems
which can be entirely resolved by the
use of one or the other technique. For
example, a (28,35) problem of the
tanker scheduling variety (see <9>)
yields an integer solution after
imposition of a total of 6
Gomory-Johnson cuts in 2
reoptimizations.

The same problem yields well to
state enumeration techniques (11 BB
nodes of the propagation type, i.e.
with only one true linear program
resolved, followed by between 9 and 33
enumeration nodes depending on the
initial state; at the end none of the
alternate BB nodes need to be processed
further).

On  the  other  hand  it  requires  a 
large  number  of  LP  optimizations  in  a 
straightforward  BB  optimization  run 
(between  40  and  109  linear  programs 
depending  on  strategy). 

In the following table we describe
various approaches to the (15 by 15)
problem used as example in the text. In
some respects it is rather difficult:
the gap between LP obj. function and
integer solution objective function is
large; the penalties are rather small;
the response to Gomory-Johnson cuts is
only moderate; the degree of the
initial system is 3. It behaves much
better when Benders inequalities are
added, permitting the reduction of the
degree to 1 (fixing of three
variables); etc.

In the table we summarize a number
of runs using some or all of the
available tools. One does or does not
add Gomory-Johnson cuts. One does or
does not enumerate at a selected
pending node of the BB tree. The state
for enumeration can be chosen
arbitrarily: we ran the program with
the state determined from the rounded
LP solution, with all fractional
variables less than or equal to .5
fixed at 0, and all variables larger
than .5 fixed at 1.
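
The rounding rule just described can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text; the function name `state_from_lp` and the list representation of the LP solution are assumptions, not part of the APL system.

```python
def state_from_lp(lp_solution):
    """Round an LP solution to a 0/1 state (hypothetical helper).

    Values less than or equal to 0.5 are fixed at 0,
    values larger than 0.5 are fixed at 1.
    """
    return [1 if value > 0.5 else 0 for value in lp_solution]


# Example: a fractional LP solution and the enumeration state derived from it.
state = state_from_lp([0.0, 0.5, 0.51, 1.0, 0.2])
```

Note that a value of exactly .5 goes to 0 under this rule, so the state is determined unambiguously for every LP solution.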

We entered some guesses at a
solution and appended the resulting
Benders inequalities to the original
tableau.

Which  of  the  results  are 
considered to be best will depend on
the  relative  computational  efforts,  and 
therefore  on  the  computer  and  on  the 
system  used  (APL  in  our  case).  In 
general,  we  feel  it  most  essential  to 
reduce  the  number  of  linear  programs, 
but  even  more  fundamentally,  to  avoid 
large  trees  (many  BB  nodes,  large 
enumeration  levels),  i.e.  situations  in 
which  the  combinatorics  overwhelms  the 
problem  solver. 

The  interactive  system  appears  to 
give  the  insight  and  often  the  tools 
for  avoiding  exponentially  intractable 
situations;  i.e.,  even  for  more 
difficult  problems  (e.g.,  the  (37  by  7k 
problem  of  <9>),  which  we  did  not  run 
to  conclusion  because  of  slow  APL 
response  at  the  terminal,  it  was  clear 
that the search was "well-behaved", the
tree remaining a "low-level tree",
with steady progress being made.
Initial LP value

ai* .  96 

1 14 .  9  (3 

1 1+ ,  9  G 

lU.OG 

II4.  95 

lU.  06 

Total   n.  of  cuts 

0 

o' 

0 

1 

1*7  i 

hi 

reopt  i  mi  za  t  i  ons 

0 

0 

0 

7 

7 

7 

Final   va 1 ue  of  LP 

1 1+ .  G  6 

14.96 

lit. 05 

16. G3  ^ 

16.53 

13.30 

1  ilLc^rri      \J\J  L  1  lilUliI 

2  5 

7  n  i 

^  0 

Use  H-B? 

no 

yes 

yes 

T 

yes 

yes 

yes 

N.   of  nodes  generated 

13 

2 

2 

2 

2 

11 

2 

2 

2 

2 

Optimum  found  at  node 

12 

  .. 

Use  enumeration?  yes 

no 

yes 

yes 

yes 

yes 

State  for  enumeration 

from  LP? 

yes 

yes 

yes 

yes 

1? 

yes 

0? 

j 

N.  of  nodes 

105 

63 

37 

19 

17 

Optimum  foiind  at  node 

73 

31 

25 

10 

2 

Use  guesses? 

no 

no  ! 

no 

i 

yes  1 

yes 

yes 

Type  of  guesses 

random! 

N.  of  guesses 

10 

6 

10 

N.  of  Benders  ineq. 

12  ' 

13 

12 

N.  of  variables  fixed 

0  ' 

1 

3 

H.   of  LP  solved 

0 

25 

2 

 T 

9  i 

7 

10 

Table 

Six Approaches to Solving the (15,15) Sample Problem


5. Appendix: Definition of Some Terms <10>


A logical inequality in bivalent
variables is meant to be an inequality
with 0, 1, -1 coefficients only.

A preferred variable inequality
(abbreviated p.i.) is a logical
inequality of the form
$\sum_{j \in J} \bar{y}(j) \ge 1$,
$\bar{y}(j)$ being either $y(j)$ or $1 - y(j)$,
which expresses the condition that at
least one of the $\bar{y}(j)$, $j \in J$, must be
one.

The values $\bar{y}(j) = 1$ (i.e., either
$y(j) = 0$ or $y(j) = 1$, depending on the
nature of the $\bar{y}(j)$) are termed the
suggested or indicated values of the
preferred variables.

Given some procedure for generating a
set of preferred inequalities, we call
minimal (relative to the procedure)
those p.i.'s with minimal number of
non-zero coefficients.

The degree d of a m.p.i. system (and by
extension the degree of the initial
inequality or system of inequalities
from which the m.p.i. system was
derived) is the number of non-zero
coefficients of one of the minimal
preferred inequalities.
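
As a small illustration of these definitions, the following Python sketch checks whether a 0/1 assignment satisfies a preferred variable inequality and computes the degree of a system. The representation (an index list J plus a set of complemented indices) and all function names are assumptions made for this sketch; they do not come from the system described in the text.

```python
def literal_value(y, j, complemented):
    """Value of the literal for index j: 1 - y[j] if complemented, else y[j]."""
    return 1 - y[j] if j in complemented else y[j]


def satisfies_pi(y, J, complemented=frozenset()):
    """True if at least one literal of the preferred inequality equals one."""
    return sum(literal_value(y, j, complemented) for j in J) >= 1


def degree(pi_system):
    """Number of non-zero coefficients of a minimal p.i. in the system.

    pi_system is a list of (J, complemented) pairs.
    """
    return min(len(J) for J, _ in pi_system)


y = {1: 0, 2: 1, 3: 0}
ok = satisfies_pi(y, [1, 2])          # y(2) = 1 satisfies the inequality
```

A degree of 1 means that some minimal inequality contains a single literal, so the corresponding variable can be fixed at its indicated value.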
ABSTRACT

In  this  paper  we  study  an  accelerated 
version of Powell's conjugate direction
technique for solving unconstrained nonlinear
problems. This technique employs a
method  of  taking  large  steps  to  enhance 
movement  along  ridges.     Not  only  does  this 
technique  improve  the  speed  of  convergence 
for  nonquadratic  problems,   but  it  improves 
the  robustness  of  the  procedure.     An  APL 
code,   useful  for  research  because  of  its 
interactive  capabilities,   is  presented  and 
described  in  detail. 

I  INTRODUCTION 

The  topic  of  nonlinear  unconstrained 
problem  solving  is  by  no  means  new  to  the 
OR/MS  literature.     Indeed,  many  of  the 
basic  concepts  of  solving  this  type  of 
problem  have  been  known  for  some  time. 
Beginning approximately in 1960 there has
been a proliferation of papers on unconstrained
optimization techniques. Of
these, one of the best known and most
extensively tested is Powell's method of
conjugate directions. This paper studies
Powell's technique and offers a modification
that our experiments show improves
the rapidity of convergence to a solution
when the criterion function exhibits nonlinear
ridges. It employs a code that
displays  the  robustness  of  the  technique 

*Queen's University, Kingston, Ontario
Working  Paper  No.   76-4,  May  1976. 


as  well  as  effecting  our  particular  means 
of  accelerating  convergence.     The  former 
is  demonstrated  by  a  discussion  of  the 
effects  on  program  operation  of  changing 
various  computational  parameters.     We  also 
demonstrate  the  usefulness  of  interactive 
languages,  APL  in  particular,   for  this 
kind  of  research. 

Powell's (1965) conjugate direction
method is chosen for several reasons.
Among other tests of Powell's method,
Box's (1966) paper indicates that it is
competitive with several other methods
on a variety of nonlinear problems.
Additional discussions of conjugate
gradients can be found in Fletcher and
Reeves (1964), Powell (1965) and more
recently Zangwill (1969). Moreover,
Powell's  technique  is  straightforward  and 
can  be  implemented  rather  easily  without 
requiring  either  large  amounts  of  core 
storage  or  extensive  manipulation  of 
stored  variables,   so  that  it  constitutes 
a  reasonable  choice  for  solving  large 
scale  problems.     Another  reason  for  the 
selection  of  Powell's  technique  is  that 
it  avoids  the  necessity  of  calculating 
derivatives.     Finally,   the  conjugate 
direction  technique  has  the  property  of 
finite  convergence  for  quadratic 
functions;    in  this  sense  it  is  similar  to 
other  techniques  including  those  by 
Fletcher  and  Powell    (1963)   and  Fletcher 
and  Reeves    (1964),   although  the  latter 


two cases make use of explicit calculations
of the first derivative.

The outline of the paper is as
follows. In Section II we briefly discuss
Powell's algorithm and the minor
modifications we made in setting up our
comparison standards. In Section III we
examine the ridge following problem and
our method of accelerating the convergence
of Powell's technique when these
ridges are nonlinear. The results of
tests on several example problems are
reported. In Section IV we discuss some
problems caused by variation in certain
input parameters. Section V presents our
position regarding the usefulness of APL
for carrying out the type of investigation
reported here. Finally, Section VI
presents our conclusions. The APL listing
of the code that generated our
results is included in an Appendix.

II     THE  TECHNIQUE 

In this section we present only
enough of Powell's method to be able to
identify the differences between Powell's
original code and the version we have
employed. The major difference, leading
to accelerated convergence in problems
with nonlinear ridges, is permitting
Powell's algorithm to commence a new
iteration at a point different from (and
usually worse in terms of function value
than) the point at which the previous
iteration ended. This modification, which
we call a leap, is discussed fully in
Section III. In order to demonstrate the
comparability of our results, it is
important to indicate the additional,
minor changes to Powell's technique that
we have found useful. Essentially, these
minor modifications are designed to
permit making certain choices of
computational parameters that we employed
in developing the accelerated convergence
procedure. Powell's basic code
may be expressed in the following steps.

i) For r = 1 to n calculate a minimum
along direction $d_r$ beginning at the
resulting minimum of the last
direction searched. The initial
point is arbitrary while the starting
directions are the coordinate
directions of the domain of the
search. For any iteration let $p_0$
be the first point and $p_n$ the final
point.

ii) If this is the first iteration,
recompute the one dimensional
minimizations using the initial set of
directions. Otherwise, calculate a
new direction from $(p_n - p_0)$ and
minimize the objective function in
this direction. The resulting point
is $p_0$ for the next iteration.

iii) Of the r + 1 directions select one
to be removed by the criteria given
below and return to i).
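
The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the APL code of the Appendix: the one-dimensional search is replaced by a golden-section search on a fixed interval (the paper uses quadratic fits), the first iteration is read as two sweeps over the coordinate directions, and the direction dropped in step iii is the one most nearly parallel to the new direction, as described in the next subsection. All names (`line_min`, `sweep`, `powell_search`) are hypothetical.

```python
import math


def line_min(f, p, d, lo=-10.0, hi=10.0, tol=1e-10):
    """Golden-section stand-in for the one-dimensional search along d from p."""
    g = (math.sqrt(5.0) - 1.0) / 2.0

    def phi(t):
        return f([pi + t * di for pi, di in zip(p, d)])

    a, b = lo, hi
    c, e = b - g * (b - a), a + g * (b - a)
    fc, fe = phi(c), phi(e)
    while b - a > tol:
        if fc < fe:              # minimum lies in [a, e]
            b, e, fe = e, c, fc
            c = b - g * (b - a)
            fc = phi(c)
        else:                    # minimum lies in [c, b]
            a, c, fc = c, e, fe
            e = a + g * (b - a)
            fe = phi(e)
    t = 0.5 * (a + b)
    return [pi + t * di for pi, di in zip(p, d)]


def sweep(f, p, dirs):
    """Step i: minimize along each direction in turn."""
    for d in dirs:
        p = line_min(f, p, d)
    return p


def powell_search(f, start, n_iter):
    n = len(start)
    dirs = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # First iteration: repeat the sweep over the initial directions (step ii).
    p = sweep(f, sweep(f, list(start), dirs), dirs)
    for _ in range(n_iter - 1):
        p_first = p
        p = sweep(f, p_first, dirs)                       # step i
        new_d = [a - b for a, b in zip(p, p_first)]       # step ii
        norm = math.sqrt(sum(x * x for x in new_d))
        if norm == 0.0:
            break                                         # no movement: stop
        new_d = [x / norm for x in new_d]
        p = line_min(f, p, new_d)

        def parallelism(d):
            # |cosine| of the angle between d and the new direction
            dot = sum(a * b for a, b in zip(d, new_d))
            return abs(dot) / math.sqrt(sum(a * a for a in d))

        # Step iii: drop the old direction most nearly parallel to the new one.
        k = max(range(n), key=lambda i: parallelism(dirs[i]))
        dirs = dirs[:k] + dirs[k + 1:] + [new_d]
    return p
```

The fixed search interval is chosen only to keep the sketch short; it assumes the minimum along each line lies within ten units of the current point.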

Our  modifications  to  Powell's  method 
involve  the  manner  in  which  directions 
are  removed,   choices  of  some  computational 
parameters,   and  convergence  criteria. 

Criteria  for  Removal  of  Directions. 
Powell  presents  a  scheme  for  determining 
which  direction  should  be  removed,  a 
scheme  that  includes  the  maintenance  of 
the original directions. Our criterion
merely removes (from the set of r original
directions)    that  one  which  is  most  nearly 
parallel  to  the  new  one    (i.e.,  the 
smallest  angle  between  the  two  normalized 
direction  vectors). 


Tolerance Parameters. We use the
term "tolerance parameters" to refer to
those parameters which the operator must
choose before running the program.
Powell's technique employs well defined
rules for modifying these parameters
during the operation of the code, but the
tolerance parameters themselves are rarely
specified in any discussion of the tests
of proposed codes (in Powell's work or
elsewhere). We believe that in most
instances robustness of a particular
technique is an important feature of its
operation, since if particular parameters
must be determined before a given routine
will work satisfactorily it can imply a
great deal of testing on particular
functions before a good selection becomes
possible. In duplicating Powell's
technique we experienced difficulty in
determining the exact computational
parameters to employ, but partially
overcame the difficulty through using the
interactive properties of APL to obtain
comparisons with Powell's reported test
results on an iteration by iteration basis.

Five different parameters were
specified for the operation of our program.
In the listing of the program in the
Appendix the 3-vector TOL is used to
specify the first three of these
parameters. These values are also used
by Powell and are termed m, q and e in
his paper. The last two parameters we
employ are unique to our code and are
called MPASS and MTEST respectively. All
five parameters are used during the
operation of the one-dimensional search
technique used in our procedure. A brief
description of these parameters follows:

TOL [1] is the upper bound on the
length of a step taken by the one
dimensional search technique. It is
invoked whenever a quadratic prediction
is further away from the points used for
the prediction than the value TOL [1]
permits.

TOL [2] is the magnitude of the step
length along the direction being minimized.
This value is used to determine the
position of two additional points needed
for the first quadratic fit.

TOL [3] is the required accuracy to
which variables along this line must be
determined before the search for a minimum
will be terminated (unless other termination
criteria are invoked). In the
operation of our algorithm each of the
above parameters affects the search
dynamically in that they are employed in
conjunction with the direction vector
currently used. The end result is that
tighter bounds are implicitly invoked as
the movements of the variables become
smaller.

MPASS is the maximum number of
function evaluations allowed during any
normal linear search. Our use of this
parameter resulted from the discovery that
the number of quadratic fits used to predict
a particular minimum along a line was
a more useful parameter than TOL [3]. We
find that the number of iterations is much
easier to control directly rather than
with the use of TOL [3]. Moreover, by
determining the number of quadratic fits
we use for any iteration, the accuracy of
the search can be controlled in any event,
as we shall show.

MTEST  is  used  in  an  attempt  to  improve 
the  robustness  of  the  program.     In  the 
program  we  allow  for  a  reduction  of  TOL 
[2]   by  a  factor  of  ten  whenever  TOL   [1]  is 
invoked  or  when  a  predicted  minimum  turns 
out  to  be  larger  than  any  of  the  three 
predicting  points.     The  result  is  that 
within  a  single  linear  search  TOL   [2]  is 
scaled  down  if  unreasonable  predictions 
have  been  achieved.     MTEST  limits  the 
number  of  times  this  reduction  can  occur 
for  any  one  linear  search. 
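
To make the roles of TOL [1] and TOL [2] concrete, here is a simplified Python sketch of a single quadratic prediction along a line. It is an assumption-laden illustration, not the search routine of the Appendix: the two extra points are placed symmetrically at a distance `step` (playing the role of TOL [2]) on either side of the current point, and `max_step` (playing the role of TOL [1]) bounds the move. The MPASS and MTEST logic is not modelled. The name `quadratic_step` is hypothetical.

```python
def quadratic_step(phi, t, step, max_step):
    """One quadratic prediction of the minimizer of phi near t.

    phi      : function of one variable (the objective along the line)
    step     : spacing of the two extra points (role of TOL[2])
    max_step : upper bound on the move (role of TOL[1])
    """
    f_minus, f_zero, f_plus = phi(t - step), phi(t), phi(t + step)
    curvature = f_minus - 2.0 * f_zero + f_plus
    if curvature > 0.0:
        # Vertex of the parabola through the three equally spaced points.
        t_new = t + 0.5 * step * (f_minus - f_plus) / curvature
        if abs(t_new - t) <= max_step:
            return t_new
    # Prediction unusable or too far away: take a bounded step downhill.
    direction = 1.0 if f_plus < f_minus else -1.0
    return t + direction * max_step


# On an exact quadratic a single fit lands on the minimizer (here t = 0.3).
t_star = quadratic_step(lambda t: (t - 0.3) ** 2 + 1.0, 0.0, 0.1, 10.0)
```

When the bound is active the step is simply truncated, which is one plausible reading of how TOL [1] is "invoked"; the original code may treat this case differently.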

Stopping Rules. No specific routine
was written to stop our code, so that the
result is a termination by default whenever
no new minimum is generated in any
conjugate direction. Stopping under
these  conditions  is  a  result  of  machine 
accuracy.     It  would  be  an  easy  matter 
either  to  invoke  stopping  under  less 
stringent  criteria  or  to  use  Powell's 
stopping  rules.     Because  of  the  nature 
of  our  research,   however,   we  chose 
merely  to  input  a  specified  number  of 
iterations  and  include  the  capability  of 
restarting  the  program  without  loss  of 
previous  information.     This  procedure 
might  also  prove  practically  useful  during 
batch  processing.     Especially,  the  ability 
to  stop  the  operation  and  perform  some 
cost-benefit  analysis  before  deciding 
whether  to  continue  the  search  might  be 
useful  when  the  nonlinear  program  is 
working  as  a  subroutine  to  a  more  general 
program. 


Duplication of Powell's Results.
Table I displays the results of using
our version of Powell's code on the three
test problems reported in his paper. The
results of our code use TOL = 10, .1,
.001; MPASS = 4 and MTEST = 1. The
similarity in number of evaluations is,
of course, not surprising, and is included
only to demonstrate the similarity in our
procedures as used so far. In the rest of
the paper comparison standards will be the
results obtained using this version of
Powell's code. These results will be
compared with those obtained from using
the leapsize modification we have developed
to accelerate its convergence. The
nature of this modification is discussed
next.

III     AN ACCELERATION TECHNIQUE FOR THE
CONJUGATE  DIRECTIONS  SEARCH 

A well known phenomenon in nonlinear
search is often referred to as ridge
following. In many problems the search
begins to follow a long narrow ridge (or
valley in the case of minimization) with
the result that the program runs for an
inordinate length of time while improving
the criterion values very slowly. The
possibility that changes occur so slowly
that the program terminates far from an
optimal (global or local) solution is also
present in these cases. While conjugate
directions work well in the case of linear
or near linear ridges, when the ridge
curves conjugate directions can tend to
zig-zag considerably, much as gradient
directions perform in the case of quadratic
functions. Figure I provides a
hypothetical example of a typical search
pattern in this case. The modification of
Powell's technique that we report herein
was predicated on the belief that it might
be useful to continue to move in the
general direction of the curvature of the
ridge for some distance past the exact
minimum along the search direction. We
conjectured that this procedure will lead
to a result somewhat like Figure Ib (and
thus improve convergence) whenever the
function is not too closely quadratic.
In these circumstances, the ridges become
nonlinear and would, we expected, slow
down convergence of the unmodified conjugate
directions technique. These
conjectures were validated: Figure II
provides some actual results of a search
we conducted on the Rosenbrock problem.
The results of this and other searches
are discussed below; supporting data for
Figure II appear in Table II (Leapsize =
1 and 3).

In  our  modification  to  Powell's  code 
we  move  in  the  general  direction  of  the 
ridge's  curvature  at  the  end  of  each 
iteration.     At  this  time  a  new  conjugate 
direction  has  been  generated  and  we  have 
minimized  in  this  direction.  Our 
modification  proceeds  by  continuing  along 
                                            NUMBER OF
                                            FUNCTION       FUNCTION
PROBLEM           PROGRAM    ITERATION      EVALUATIONS    VALUE

ROSENBROCK        POWELL     12             142            6x10^[illegible]
                             13             151            7x10^-10

ROSENBROCK        OURS       12             141            2x10^[illegible]
                             13             151            3x10^[illegible]

POWELL (1962)     POWELL      8             126            3x10^-[illegible]
                             10             148            8x10^[illegible]

POWELL (1962)     OURS        8             136            4x10^[illegible]
                             10             163            2x10^[illegible]

POWELL (1964)     POWELL      5             not reported   2.9731
(function of 3                6             not reported   3.0000
variables)

POWELL (1964)     OURS*       5             102            2.9998
(function of 3                6             113            3.0000
variables)

TABLE I: Comparison of Two Programs to
Demonstrate Similar Results.
All problems taken from Powell
(1964).


*Our number of evaluations is adjusted to reflect the fact
that we made no use of the unit second derivative as Powell
does. This leads to one less evaluation along any direction
previously used.


this  direction  some  number  of  units 
further  than  the  distance  we  have  just 
moved.     For  instance,   if  in  minimizing 
along  the  new  direction  we  had  moved  a 
distance  of  one-half  unit,  at  the  end  of 
the  iteration  we  take  what  we  term  a  leap 
to  a  new  point  in  that  direction  (usually 
with a worse function value). The length
of  the  leap  is  called  a  leapsize  and 
indicates  how  many  units  we  use  as  a 
factor  to  multiply  the  current  direction 
vector  to  move  between  iterations.*  The 
effect  of  using  this  parameter  is  dynamic 
in  the  sense  that  it  depends  on  the  size 
of  the  direction  vectors  calculated  in  a 
given  iteration.     Thus,  as  the  search 
gets  closer  to  a  minimum  the  size  of 
leap  actually  taken  reduces  to  zero. 
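
One way to read the leap in code is sketched below. This is an interpretation, not a transcription of the APL listing: `p_before` is taken to be the point from which the minimization along the new direction started, `p_min` the minimum found along it, and the next iteration starts at `p_before + leapsize * (p_min - p_before)`. The function name `leap_start` is hypothetical.

```python
def leap_start(p_before, p_min, leapsize):
    """Starting point of the next iteration after a leap.

    With leapsize = 1 this returns p_min, so the search duplicates
    Powell's; with leapsize > 1 the point overshoots the line minimum.
    """
    return [b + leapsize * (m - b) for b, m in zip(p_before, p_min)]


# A move of one-half unit with leapsize 3 starts the next iteration
# one and one-half units from the point where the line search began.
next_start = leap_start([0.0, 0.0], [0.5, 0.0], 3.0)
```

Because the leap is a multiple of the distance just moved, it shrinks automatically as the moves along the new direction become small near a minimum.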

Table  II  reports  the  results  of  our 
modification  on  Rosenbrock's    (1960)  two 
dimensional  problem: 

Minimize $f(x_1, x_2) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2$.

*In our program, when the leapsize parameter
is set to 1, the search technique
duplicates Powell's. If leapsize > 1,
each new iteration begins at a point
different from that chosen by Powell's
method.


This  function  is  of  interest  because  it 
has  the  type  of  nonlinear  valley  we  have 
been  discussing.     In  Table  II  we  compare 
the  results  of  using  Powell's  method  with 
the  modifications  resulting  from  taking 
progressively  larger  leapsizes.     It  can  be 
seen  that  Powell's  unmodified  search 
technique  moves  rather  slowly  along  the 
nonlinear  valley  of  the  Rosenbrock 
function while, as expected, the use of the
leap  moves  the  search  more  rapidly  along 
the  valley,   thus  in  general  effecting  a 
more  rapid  convergence  to  a  minimum. 
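
For reference, Rosenbrock's function and its curved valley can be written out in a few lines of Python. The sample points below are chosen only for illustration and are not taken from the tables.

```python
def rosenbrock(x1, x2):
    """Rosenbrock's two dimensional test function; minimum 0 at (1, 1)."""
    return 100.0 * (x2 - x1 ** 2) ** 2 + (1.0 - x1) ** 2


# On the parabola x2 = x1**2 (the valley floor) only the second term remains,
# while a small step off the parabola is penalized heavily by the first term.
on_floor = rosenbrock(0.5, 0.25)
off_floor = rosenbrock(0.5, 0.75)
```

The valley floor bends along the parabola, which is why straight-line searches tend to zig-zag and why an overshoot in the general direction of travel can help.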

Tables III and IV display the effect
of leapsizes for a three and a four
variable problem respectively on which
test results are also reported by Powell
(1964). The three dimensional problem is:

Maximize $f = \frac{1}{1+(x-y)^2} + \sin\left\{\tfrac{1}{2}\pi x y z\right\} + \exp\left\{-\left(\frac{x+z}{y} - 2\right)^2\right\}$

while  the  four  dimensional  problem  is  of 
the  form: 
FIGURE I: Possible Improvement in Convergence.
(a) Following the ridge; (b) Moving
away from the ridge.


FIGURE II: Convergence of Powell's Technique and
its Modification to Solution of Rosenbrock
Test Problem. Path chosen by Powell
indicated by dots; modification by crosses.
Dashed line indicates continuation of movement
made by the modification.
              LEAPSIZE = 1             LEAPSIZE = 2             LEAPSIZE = 4

              NUMBER OF                NUMBER OF                NUMBER OF
              FUNCTION     FUNCTION    FUNCTION     FUNCTION    FUNCTION     FUNCTION
ITERATION     EVALUATIONS  VALUE       EVALUATIONS  VALUE       EVALUATIONS  VALUE

1              40          2.764        41          2.639        39          1.994
2              59          2.859        58          2.540        56          2.629
3              73          2.888        74          2.681        73          2.728
4              88          2.965        90          2.715        89          2.729
5             104          2.983       109          2.976       109          2.825
6             120          3.000       125          2.999       128          2.995
7                                      141          3.000       143          2.999
8                                                               158          3.000

TABLE III: Solution of Three Variable Problem with
Various Leapsizes
TOL = 10, .1, .0001
MPASS = 4
MTEST = 1


              LEAPSIZE [value illegible]       LEAPSIZE = 2

              NUMBER OF                        NUMBER OF
              FUNCTION     FUNCTION            FUNCTION     FUNCTION
ITERATION     EVALUATIONS  VALUE               EVALUATIONS  VALUE

 2             54          .2040                55          .1401
 4             91          .0075                92          .0021
 6            129          .0018               133          .0018
 8            168          4x10^-[illegible]   172          6x10^-[illegible]
10            203          2x10^-[illegible]   209          5x10^-[illegible]


              LEAPSIZE [value illegible]       LEAPSIZE = 8

              NUMBER OF                        NUMBER OF
              FUNCTION     FUNCTION            FUNCTION     FUNCTION
ITERATION     EVALUATIONS  VALUE               EVALUATIONS  VALUE

 2             55          .1064                56          .1141
 4             95          .0012                97          9806.
 6            135          .0003
 8            174          4x10^-[illegible]

TABLE IV: Solution of Four Variable Problem
with Various Leapsizes
TOL = 10, .1, .0001
MPASS = 4
MTEST = 1
              NUMBER OF
              FUNCTION     FUNCTION
ITERATION     EVALUATIONS  VALUE

 1             45          153.384
 2             69           82.772
 3             93           32.300
 4            117           15.084
 5            141            8.190
 6            164             .285
 7            186             .005
 8            206             .003
 9            226          3x10^-[illegible]
10            244          3x10^-[illegible]

TABLE V: Solution of Random Trigonometric Problem
Using Powell's Method
TOL = 10, .1, .0001
MPASS = 4
MTEST = 1
Minimize $f = (x_1 + 10x_2)^2 + 5(x_3 - x_4)^2 + (x_2 - 2x_3)^4 + 10(x_1 - x_4)^4$

Inspection  of  the  tables  shows  that  the 
same  leapsize  is  not  necessarily  the  best 
for every problem. In Table IV a leapsize
of eight moves the point so far from
the  ridge  that  it  cannot  be  attained 
again.     We  would  obviously  wish  to  lower 
leapsize  whenever  the  final  point  in  an 
iteration    (before  the  leap)    is  worse  than 
that  of  the  previous  iteration.   This  will, 
of  course,   assure  convergence  if  a  local 
minimum  is  also  a  global  minimum. 
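
The safeguard suggested in this paragraph could be sketched as follows. The text only says that leapsize should be lowered when the final point of an iteration (before the leap) is worse than that of the previous iteration; the halving of the excess over 1 used here, and the name `update_leapsize`, are assumptions made for illustration.

```python
def update_leapsize(leapsize, f_end, f_prev_end, shrink=0.5):
    """Lower leapsize when an iteration ends worse than the previous one.

    f_end      : function value at the end of this iteration (before the leap)
    f_prev_end : the same quantity for the previous iteration
    shrink     : hypothetical factor applied to the excess of leapsize over 1
    """
    if f_end > f_prev_end and leapsize > 1.0:
        return max(1.0, 1.0 + shrink * (leapsize - 1.0))
    return leapsize


# A leapsize of 4 is cut to 2.5 after a deterioration, and kept otherwise.
reduced = update_leapsize(4.0, f_end=2.0, f_prev_end=1.0)
```

Repeated deterioration drives leapsize back toward 1, at which point the search behaves like the unmodified method.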

Returning now to the three dimensional
results reported in Table III, we
see that leaps do not improve the rate of
convergence in this case. We interpret
this result as stemming from the fact
that this particular function approximates
a quadratic in the region we are
searching. Supporting this belief is the
fact that the first complete set of conjugate
directions is generated at
iteration five, whereupon the search
converges immediately to the optimum.

To  justify  our  interpretation,  we 
show  the  case  of  quadratic  convergence 
more  dramatically  for  a  function  which  is 
not  obviously  quadratic.     To  effect  this 
demonstration  we  use  the  bounded 
trigonometric  function  presented  by 
Fletcher  and  Powell    (1963).     In  this  case 
the  function  is: 


Minimize $f = \sum_{i=1}^{n} \left\{ \sum_{j=1}^{n} (A_{ij} \sin x_j + B_{ij} \cos x_j) - E_i \right\}^2$

where the A and B matrices are random
numbers between zero and 100 and the
solutions $x_j$ are random variables between
$-\pi$ and $\pi$. The values of $E_i$ are then
determined so that each term will equal
zero at the optimum. The starting point
for each variable is randomized between
$+.1\pi$ and $-.1\pi$ of the optimum solution.
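
A test problem of this kind can be generated as in the following Python sketch. It follows the description in the text (coefficients between zero and 100, solution between -pi and pi, start within .1 pi of the solution); the function names, the use of a seeded generator, and the uniform distributions are assumptions.

```python
import math
import random


def row_sums(A, B, x):
    """Inner sums over j of A[i][j]*sin(x[j]) + B[i][j]*cos(x[j]), one per row i."""
    return [
        sum(a * math.sin(xj) + b * math.cos(xj) for a, b, xj in zip(Ai, Bi, x))
        for Ai, Bi in zip(A, B)
    ]


def make_trig_problem(n, seed=0):
    """Return (f, x_opt, x_start) for a random n dimensional problem."""
    rng = random.Random(seed)
    A = [[rng.uniform(0.0, 100.0) for _ in range(n)] for _ in range(n)]
    B = [[rng.uniform(0.0, 100.0) for _ in range(n)] for _ in range(n)]
    x_opt = [rng.uniform(-math.pi, math.pi) for _ in range(n)]
    E = row_sums(A, B, x_opt)          # makes every squared term zero at x_opt
    x_start = [xj + rng.uniform(-0.1 * math.pi, 0.1 * math.pi) for xj in x_opt]

    def f(x):
        return sum((s - e) ** 2 for s, e in zip(row_sums(A, B, x), E))

    return f, x_opt, x_start


f, x_opt, x_start = make_trig_problem(5)
```

Since E is computed from the chosen solution, the objective is zero there by construction, which gives a known optimum against which a search can be checked.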

Although the problem is seemingly
nonquadratic, it turns out to be
difficult to generate a random problem
which is not essentially quadratic.
Figure III shows the function $\sin x_j +
\cos x_j$ and the values of $x_j$ for what we
found as a typically generated random 5
dimensional problem. (Results for this
problem are reported in Table V.) Note
that  all  these  values  fall  in  the  linear 
portion  of  the  curve.     Although  the  values 
of  A  and  B  have  some  effect  on  the  shape 
of  this  curve  the  fact  that  there  are  five 
such  terms  in  each  squared  term  leads  us 


to  believe  that  the  nonlinear  effects  can 
be  balanced  against  each  other.  The 
resulting  function  is  then  approximately 
the  square  of  five  linear  terms  -  a 
quadratic.     We  would  expect  that  in  a 
conjugate  search  procedure  the  optimum 
would  be  found  as  soon  as  a  complete  set 
of  conjugate  directions  were  found.  As  can 
be  seen  in  Table  V  this  is  exactly  the 
case.     At  the  end  of  the  sixth  iteration  a 
complete  set  of  five  conjugate  directions 
has  been  found  and  used  -  at  this  point 
the  procedure  moves  nearly  instantaneously 
to  the  optimum.     No  leaps  are  displayed 
for  this  function  since  they  are,  of 
course,  not  going  to  improve  matters  any. 
We might conclude from this experience
that perhaps leaps should be ignored until
after the first set of conjugate directions
has been developed. The study of
this problem is interesting particularly
in light of its considerable use as a test
function for nonlinear problems. The
results of these tests will tend to be
biased in favor of quadratically
converging methods.

For our final test in this series we
again used the trigonometric function just
discussed but this time chose parameters
so that the solution will be on the nonlinear
portion of the curve as shown in
Figure III. Now the no-leap method for
this problem took 435 evaluations to
achieve a locally optimal solution of
.6706, as compared to 226 evaluations to
achieve the global optimum in the
near-linear case. The use of leaps yields
dramatic improvements in the operation of
the code. Table VI displays the results
of this investigation. Powell reports
solutions to two five dimensional problems
with as few as 106 evaluations, which
according to our estimates could only be
achieved if the function were perfectly
quadratic in the region of search. This
result emphasizes once more that in
nonquadratic cases leaps are useful while
in quadratic cases where the ridges are
linear they tend to be of little or
slightly negative value.

In  the  next  section  we  explore  the 
effect  of  our  leapsize  procedure  on  some 
more  difficult  exponential  problems  for 
which  the  unmodified  conjugate  directions 
technique  has  failed  to  locate  an  optimum. 

TABLE VI - Solution For Contrived Trigonometric Function
TOL = 10, .1, .0001     MPASS = 4     MTEST = 1

            LEAPSIZE = 1 (POWELL)     LEAPSIZE = 1.5            LEAPSIZE = [?]
            NUMBER OF                 NUMBER OF                 NUMBER OF
            FUNCTION      FUNCTION    FUNCTION      FUNCTION    FUNCTION      FUNCTION
ITERATION   EVALUATIONS   VALUE       EVALUATIONS   VALUE       EVALUATIONS   VALUE

    1         [?]         17.154         44         17.154         44         17.154
    2          68          7.860         69         13.207         69         15.991
    3          92          2.004         9[?]        3.631         94          2.683
    4         113          1.967        118          2.191        119          2.459
    5         134          1.967        14[?]        1.325        143          2.443
    6         159          1.953        165           .216        167           .463
    7         182          1.934        188           .042        191           .065
    8         206          1.861        211           .001        215           .058
    9         231          1.786        233         3x10^-4       238           .009
   10         252          1.739        253         2x10^-4       260           .001
   11         275          1.580        274*        2x10^-4       282         2x10^-4
   12         298          1.523                                  305*        7x10^[?]
   13         321          1.440
   14         342          1.406
   15         369           .943
   16         396           .695
   17         416           .671
   18         435           .671
   19         450*          .671

*Stopped when no variable changed more than .0001

IV     EFFECT OF TOLERANCES ON
SEARCH PROCEDURE

No technique is impervious to the selection of program parameters.
Not only can tolerance parameters affect the length of time taken to
converge, but they also can determine whether a particular technique
will converge at all. In Table II given earlier we presented the
results of our program on the Rosenbrock function. Tolerance
characteristics are fairly robust for this problem. Leaps in this
case also seem to be robust in that they improve performance in all
instances.

To  study  the  importance  of  toler- 
ances in  more  complex  circumstances  we 
used  a  problem  generated  by  Box    (1966) . 
In  Box's  notation  this  problem  is 

Minimize  f(a_1, a_2, a_3) = \sum_{x} [ (e^{-a_1 x} - e^{-a_2 x}) - a_3 (e^{-x} - e^{-10x}) ]^2

where  the  summation  on  x  is  from  0.1  to 
1.0  in  increments  of  0.1.     This  problem 
has  infinitely  many  global  and  local 
optima. We chose this problem since Box
reported that for six of nine starting
points Powell's method failed to converge
to any of the global optima. Tables VII
and VIII show the results for one of


these  cases,  with  various  parameters. 
As  did  Box,  we  had  difficulty  getting 
solutions  for  any  of  these  problems 
when  no  leap  was  used,   although  we  did 
find  that  certain  tolerance  combinations 
achieved  solutions  better  than  that 
reported  by  Box.     We  also  found  that  in 
many  cases  scrapping  the  conjugate 
directions  from  time  to  time  and  re- 
starting the  search  using  coordinate 
directions  seemed  to  improve  the  rate  of 
convergence.     This  obviously  implies  that 
coordinate  directions  are  useful  for  this 
problem.     Use  of  leaps  improved  con- 
vergence in  all  cases.     In  some  cases 
convergence  to  one  of  the  global  optima 
occurred  through  the  use  of  leaps.  In 
almost  all  cases  using  a  leap  led  to  an 
improved  value  of  the  objective  function. 
To improve things further still, we suspect that use of a dynamically
changing leapsize would be advisable. If a leap produced a better
solution along some direction we should continue to leap until a
worse solution is found before continuing on to the next iteration.
Whenever we performed this operation we achieved convergence with
Powell's code.

TABLE VII: Results From Three Dimensional (Box, 1966; Start 2) Exponential
MTEST = 1

         TOL                                      NUMBER OF
                                                  FUNCTION       VALUE OF
  1      2      3       MPASS    LEAPSIZE         EVALUATIONS    FINAL SOLUTION

 10     .1    .0001       3      1 (Powell)         [?]01         3x10^[?]
 10     .1    .0001       3      2                  118           9x10^-9
 10     .1    .0001      [?]     1 (Powell)         17[?]         3x10^[?]
 10     .1    .0001       4      2                  140           5x10^[?]
 10     .1    .0001       5      1 (Powell)         [?]           7x10^[?]
 10     .1    .0001       5      2                  153           4x10^[?]
 10     .1    .0001       6      1 (Powell)         187           3x10^[?]
 10     .1    .0001       6      2                  162           2x10^[?]
 10     .01   .0001       4      1 (Powell)         313           1x10^[?]
 10     .01   .0001       4      2                  191           8x10^[?]
 10    1      .0001       4      1  }  Found no min in any direction at the
 10    1      .0001       4      2  }  1st iteration using this large a step.

TOL 1 = Upper bound on step size
TOL 2 = Step length for first quadratic fit in any direction
TOL 3 = Accuracy of linear search
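
The leap rule discussed in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the APL code of the appendix: we assume that a leap extrapolates along the line joining the point before the last successful line search to the point after it, so that a leapsize of 1 reproduces Powell's method, and that a dynamic leap simply repeats the extrapolation until the function value stops improving. The names leap and leap_while_improving, and the max_leaps safeguard, are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def leap(x_before: Sequence[float], x_after: Sequence[float],
         leapsize: float) -> Vector:
    """Extrapolate along the last search line.

    leapsize = 1 returns x_after unchanged (no leap, Powell's method);
    leapsize > 1 overshoots x_after in the same direction.
    """
    return [b + leapsize * (a - b) for b, a in zip(x_before, x_after)]


def leap_while_improving(f: Callable[[Sequence[float]], float],
                         x_before: Sequence[float],
                         x_after: Sequence[float],
                         leapsize: float,
                         max_leaps: int = 20) -> Tuple[Vector, float]:
    """Keep leaping until a trial point is no better than the current best."""
    prev = list(x_before)
    best = list(x_after)
    best_val = f(best)
    for _ in range(max_leaps):  # hypothetical guard against endless leaping
        trial = leap(prev, best, leapsize)
        trial_val = f(trial)
        if trial_val >= best_val:  # worse (or equal) solution found: stop
            break
        prev, best, best_val = best, trial, trial_val
    return best, best_val
```

With leapsize equal to 1 the first trial point coincides with the current best, so the loop stops at once and the starting point of the next iteration is exactly Powell's.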

The  state  of  affairs  just  reported 
points  out  a  problem  with  the  one- 
dimensional  quadratic  fit  used  both  by 
Powell  and  by  ourselves.     The  surface 
seems  to  be  so  flat  along  conjugate 
directions  that  the  linear  search  stops 
far  from  the  optimum  along  the  line.  Thus 
it  is  difficult  to  say  whether  conjugate 
directions  with  a  different  line  search 
routine  would  not  work  with  this  type  of 
problem.     At  any  rate  it  is  encouraging 
to  note  that  use  of  leaps  not  only 
improves  speed  of  convergence  but  has  the 
effect  of  creating  a  more  robust  search 
procedure.
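
The one-dimensional quadratic fit mentioned above can be illustrated as follows. The sketch fits a parabola through three points (a, za), (b, zb), (c, zc) on the search line and returns its stationary point, together with a flag saying whether the fitted parabola is convex. The formulas are the standard three-point interpolation formulas and appear to correspond to the DEN and D quantities of the LMIN listing in the appendix, but that correspondence is our reading of a damaged listing; the function name and return convention are hypothetical.

```python
from typing import Tuple


def quadratic_fit_minimum(a: float, za: float,
                          b: float, zb: float,
                          c: float, zc: float) -> Tuple[float, bool]:
    """Return (d, is_minimum) for the parabola through three points.

    d is the abscissa of the stationary point of the fitted parabola;
    is_minimum is True when the parabola opens upward.
    """
    den = (b - c) * za + (c - a) * zb + (a - b) * zc
    if den == 0.0:
        # The three points are collinear: the fit has no stationary point.
        raise ValueError("points are collinear; no quadratic minimum")
    num = (b * b - c * c) * za + (c * c - a * a) * zb + (a * a - b * b) * zc
    d = 0.5 * num / den
    # The leading coefficient of the parabola is -den / ((a-b)(b-c)(c-a)),
    # so the stationary point is a minimum when this ratio is negative.
    is_minimum = den / ((a - b) * (b - c) * (c - a)) < 0.0
    return d, is_minimum
```

When the surface is very flat along a direction, den is close to zero and the predicted point is poorly determined, which is one way to see why the linear search can stop far from the optimum along the line.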


TABLE VIII:

         TOL                                      NUMBER OF FUNCTION
                                                  EVALUATIONS TO       VALUE OF
  1      2      3       MPASS    LEAPSIZE         ACCURACY OF .0001    FINAL SOLUTIONS

 10     .1    .0001       4      1 (Powell)         168                 .147
 10     .1    .0001       4      2                  235                 .009
 10     .1    .0001       4      [?]                239                 6x10^[?]
 10     .1    .0001       4      16                 195                 2x10^[?]
 10     .05   .0001       4      1 (Powell)         147                 .149
 10     .05   .0001       4      4                  145                 .149
 10     .05   .0001       4      16                 103                 .009, then diverged
 10     .5    .0001       4      1 (Powell)          55                 .022
 10     .5    .0001       4      2                   71                 .018

TOL 1 = Upper bound on step size
TOL 2 = Step length for first quadratic fit in any direction
TOL 3 = Accuracy of linear search

V     USEFULNESS OF INTERACTIVE COMPUTING
IN THE EXPERIMENTS

The line of research presented in this paper would have been nearly
impossible to conduct without using a highly flexible interactive
language such as APL. In many ways the study of optimization is
typical of highly complex search, evaluation and decision making
tasks, where a great deal of information, both useful and spurious,
is available at some cost. Gathering and presenting this information
to the decision maker may be at least as difficult as understanding
the theoretical and analytical aspects of the problem. The
interactive facilities of APL helped in numerous ways in organizing
these tasks. First, we found it easy to develop subroutines whose
performance we could study in detail prior to including them in our
main search routine. Both debugging and operation of these routines
could be carried out on an almost instantaneous basis because of the
language's interactive capabilities and its ability to monitor in
detail the functioning of these subroutines. Second, as we learned
through experience which numerical outputs from a given routine were
important to monitor, the editing capabilities of APL allowed us
rapidly to suppress inessential information. Third, when a
hypothesis regarding the operation of some part of the search
routine was formulated, the language allowed us instantaneously to
test that hypothesis under controlled experimental conditions in
which possibly confounding variables could be held constant. Fourth,
the ease with which subroutines could be programmed and operated
independently allowed us, in effect, to develop and utilize tools
for conducting these experiments, thus helping us to ascertain
certain problem characteristics as our study proceeded. Finally,
when hypotheses were shown by controlled experiments to be
incorrect, the editing and memory capabilities of the language made
it easy to backtrack and hence recover successful approaches before
conducting new experiments.

VI  CONCLUSIONS 

In this paper we considered the performance of Powell's conjugate
direction method, as originally developed and as modified in the
manner indicated. Very few problems could be found which gave
Powell's original technique difficulty in finally converging to a
solution. In those cases, restarting the directions or some other
interactive parameter change would always start the search moving
again - in this sense Powell's technique never failed to converge.
However, when we introduced the concept of leapsizes, the
modification demonstrated a beneficial effect on the speed and
robustness of convergence of the technique, in particular for cases
where the functions exhibit nonlinear valleys.

It  should  be  said  that  the  concept  of 
leapsizes  is  not  a  new  one.     For  instance, 
Rosenbrock   (1960)   used  this  idea.  His 
usage  of  leaps,  however,  differed  from 
ours  in  two  ways.     First  he  used  leaps 
with  a  steepest  descent  method.  Second, 
he  leaped  only  in  an  attempt  to  improve 
the  value  of  the  objective  function  -  not 
to  move  away  from  the  ridge  itself  as  we 
do.     This  is  a  very  important  distinction. 
APPENDIX:      LISTING  OF  CODE


THIS WS EMPLOYS POWELL'S CONJUGATE COORDINATE SEARCH WITH VARIABLE LEAPSIZE,
A PARAMETER DESIGNED TO PERMIT CHANGING THE POINT AT WHICH THE NEXT ITERATION
OF THE SEARCH TECHNIQUE BEGINS. THE DIRECTION OF THE LEAP IS THAT OF THE LAST
SUCCESSFUL CONJUGATE COORDINATE SEARCH IN AN ITERATION.

MAIN PROGRAM IS OPERATED USING    ITER,INIT,LEAPSIZE MACR POINT,FNVALUE
GLOBAL PARAMETERS FOR THIS WS ARE MPASS (MAX EVALUATIONS PER PASS),
MTEST (MAX STEPSIZE REDUCTION BECAUSE OF NO INDICATED MINIMUM)
TOL←MAXSTEPSIZE,STEPSIZE,ALLOWABLE TOLERANCE
COUNT (SET TO ZERO WHEN BEGINNING RUN; GIVES NUMBER OF FUNCTION VALUES COMPUTED)
ZMIN GIVES MIN FNVALUE FOUND TO DATE
XX GIVES DIRECTIONS EMPLOYED IN THE CURRENT ITERATION (DELETED DIRECTIONS
MAY BE REPORTED, BUT ARE NOT CALLED FOR IN THE ROUTINES AS PROGRAMMED).
DATA GIVES HISTORY OF THE SEARCH


VMACRiUlV 

V  NTIM  MACR  INI  iDIRE  ;POIN      xN  I  xPFDU  ;ORT  ;YY  xDIM  iDIMM 

■[1]  (\INPUTS  ARE  NTIM  MACR  IN  IT  POINT. 

[2]  fl  TOL  MUST  ALWAYS  BE  DEFINED  BEFORE  RUNNING  TP  IS  PPOGPAM 

[3]  <\XX  AND  DATA  MATRICES  MUST  BE  DEFINED  IF     NTIM[  21^1 

[4]  DIMM->-pINI 

[5]  DIM-^DIMM-1 

[61  A'J-<-0 

[7]  -♦/IQilx  ,;/rj//[2]  =  l 

rs]  DATA-^d  ,DTMM)pTNI 

[91  XX*-(\DIM^o  .'=\PIM 

[101  DATA*-DATA  ,\  XMNI  SEEK  XX 

[in  NI^O 

[12  1  ADA:DATA^PATA  ,  FlK  ,  (  ~1  ,DIM!n^DATA  )PFFX(  ( -DIM)  ,Dir)fXX 

[131  JK-^-l^oDATA 

[lul  DIRE-^DIM^  ,DATA[.JM;  1- DATA[  JM- NTIM[.2l+D  IM  ;  1 

[15  1  PA  TAi-DA  TA,[  11,  LMIN  (  ,  DA  TA  [  JM ;  1  )  .  DIRE 

[161  POIN-^DIMt(NTIM\.?,']y^  ,  (~  1  ,DIMM)  ^DATA  )-(NTI!'\  ll-l  )x  ,DATA'^  JM 

[17  1  DATA-^DATA  JAIPOIN  ,EVAL>^POIN 

[181  .V7'J/-*[  21^1 

[19  1  YY-^a-DIM)  ,DIM)^XX 

[201  ORTi-(yyHDiM,  1  )p(  +  /yrxyr)*o.  5 )  +  .  ^dire 

[211  OPT*- \  OPT 

[221  ORT->-{(ORT=[ /ORT)y<\pORT)-0 

[231  RFDU->-(l*oXX)ol 

[2Ul  REDin  ((^PEDU)+ORT-DI!n-^0 

[2S1  XX*-REDU^XX 

[26]  XX-^XX  ,^1^DIRE 

[271  NI-^NI+1 

[28  1  ('7 ,DIMM)^DATA 

[291  'ooo        >; 'COUNT  =       COUNT;'        ooo ' 

[301  -*ny  \NI>NTIM[.1'[ 

[311  -^AM 


■JSEEnmv 
V  RSS->-PFV  SEEK  XX  ;IM;I  ;SSMP 
[11  IMi-l^oXX 
[21  7-1 

[31       RSS*-{  \  ,DIMM)oPFV_ 

[41     AAA'.SSMP^,I,MIN(  .  (  ~  \  ,D  IMM)  ^  RSS  )  ,XX[  I;! 

[51       RSS-<-RSS  .[AISSMP 

[61  7-7+1 

[71       -e22x  i7>7/< 

[81  -*AAA 

[0]      £2£:A\T/;->-l  OiRSS 
V 
7  nnR*-LMIN  X;A;P;C:D;7.A;Zn;7.C:7.D  -.DEN  ;  ZZ  ;  7, ;  ZZZ;  T  T  ;f't' x^ZZ  \nTF.P       rr  xTEr?  ■,PIIO

[1]  ⍝USES A QUADRATIC REGRESSION ON THREE POINTS TO PREDICT A MINIMUM

21      OijtiPiiTn  ARE  r>TEP\.^  cnnRpr']  and  x\pniuf,FH  value,  and  PTRECTTON'] 

[3]  ⍝STEP[1]=MAXSTEPSIZE,STEP[2]=STEPSIZE,STEP[3]=ALLOWABLE TOLERANCE
Ul  STEP-<-TOL 

5]  PAnn*-! 

Gl  TEr^T-^Q 

71       t1M--<ApX) -1)^2 

nl  ,hJ:A-0 

9]  IpO 

101  JJ-t-O 

III  .'?*-.':rf:?r  21 

121  ,r  •.ZA-^X\ MM^l"] 

:i3i    ;5;?-t-;7U/i/:«)(/w+x  )+Bx( 

lU]  ■*G^\ZB>ZA 

15  1  r-H2x/7yj^pr2i 

161    zc->-;7y/iL«)(/7//+;r)+rx(  -AW)+x 

171     R*-{{R  ,A  ,ZA)  ,B  ,ZB)  ,C  ,ZC 

181  i:DEN*-(.  (.P.-C)^ZA)^{{C-A)>^Zr.)-^(  A-B)^ZC 

191     /^■<-(((/?*2)-r*2)xZ/l)  +  (((C*2)->l*2)xZB)  +  ((/l*2)-.'?*2)x.'^C 
:2n]     l)-'-(PxO.  5)tDF// 

:2i]    -»Wx,(  |/))>.';7'/:pr  1] 

;22l     ■*KKKx\Q<nENi  {A  -  B)^(B-r)yC-A 

:  23  1     AfZZ-(  (  W.^r/Tpr  31  )  ,  (  (  \D-B)<nTEP[  3l)  ,  (  | <.''r.^?r  3  ] 

241  -*Ly\i+/tlZZ)>0 
:25l  ?/l/7,'?*-p/l,9.'7+l 
:  2  fi  1  X  I P/5  .T.T  ^//P/l  ,'75 

'271  N:Zn'>-EVALfi(MfHX)+P>^(-fM)^X 

2R1  P*-R,D,Z1) 
.2°i^     ZZ*-ZA  ,ZB  ,ZC  ,ZP 

301    7z;^-<-(  r/.'^z  )  = 

;311     -J-^TTx  I  ( +/,':z;^ )  >i 

321  r7'/?:->-Jx  i4  =  +  /?;;^Z  =  0   0   0  1 

33  1  ZZZ<r~ZZ7. 
:3itl  ZZ*-ZZZ/ZZ 

351     ^-^ZZZ/zl  ,/?,C,/) 

3B1  .^/i-c-zzrii 

371  Z,'?-^ZZr2l 
38  1  ^C-<-ZZ[3l 
391  .1*Z[ll 
401  B^^Z\2^ 
:hii  r-HZ[3l 

4  21  -►/.' 

:43l  a  •.C*--r^TEP\.2'\ 

:  4  4 1    z  c-p  v,i  /:  <}  ( + X ) + r  X  (  -/.^a'  )  t  x 

:45l     R-^{{R  ,A  ,ZA)  ,B  ,ZB)  ,C  ,ZC 
'  ■*}'. 

[47] KKK:'TEST INDICATES NO MINIMUM; STEP[2]←STEP[2]÷10'
;4  8l  TEST-<-TE,'^T+\ 

ioi    .':r/7pr  2  i-H.'TT'rPC  21?  10 
i  1 1  ■>-,r,r,r 

i2l  /,:■*/,;, 

i3l      'VALUE  ir        '  ;((l  /Z/l,Z,'?,ZO  =  Z/1,Z/?.ZC)/Z/1.ZH,ZC 

54  1  LL:PHO-^{  \  IZA  ,Z/?,Zr)=Z/  ,7?.  ,ZC 

55  1 

561    '/r     •  ;(A^Af+/U(((p/fPxi3)  =  r/P/;nxi3)/yi,/?.r)x(-A70+Ar 

;71  LLL\ 

ial     RRR*-U'MiX)-¥(.  (  (P/Wx  ,3)  =  r/P/'(7x  I  3)//1  ..T.OxC  -A'/MtX 

5  9  1     RRRfRRR  ,  (  (  PZ/P x  ,  3  1  =  f  /P/.'(7 x  l  3  )  /  Z/1  ,  Z/? ,  ZC 

6  01  0 
[61] I:'PREDICTED MIN LARGER THAN KNOWN VALUES; STEP[2]←STEP[2]÷10'
[62]  TEnT*-TF.PT^\ 
[0:^1  '*Lx\TEr,T>.MTErT 
[64  1     nTEP\  2l*-r.TEP\  7liX0 
[051  ■*JJJ 

[661      :/?-(- .<7J';-;P[llxx/5- /I 
[67] 

[68]  -*IU1*\JI>1 
[6  9]  ?/15.T-^P/l.T.':+1. 
[70] 

[71] HH:'MAX INCREMENT USED TWICE; MAXSTEPSIZE DOUBLED'

[72]     .'TT/':?[  l]i- ,'7r^'P[  l]x  2 

[73]  ■JJJ:Pnn*-(.\./7.A,7,r.,ZC)^7.A,Zn,7.C 

[74]     RRR<-(t'M\X)-\-{  (  (p;?Pxi3)  =  [/p;/0x  i3  ) //I  .  H  ,  C  )  x  (  -  W/,/ ) 
[75]     RRR-f-RRR  ,(.  (PHfl^^  \  3)=[ /PIlOx  \  3  >/Z>1  ,  Z^,;^(::■ 
[  7  6  ]     RRR*- ,  RRR 
[77]     X*-RRR  ,{-MM)^X 
[78]      ->c7  J 

[79]  .TTT  :  ZZZ-f-C  i4)xZZZ 
[80]      ZZZ-^(  [ /ZZZ)  =  ZZZ 

[8  1]  -^rr/?, 

V 


∇EVAL[⎕]∇
∇ R←EVAL X;J;ZZZ
[1]  ⍝THIS FUNCTION COMPUTES VALUES OF A THREE-DIMENSIONAL EXPONENTIAL.
[2]  ZZZ←0.1×⍳10
[3]  R←(⍴X[1;])⍴0
[4]  J←1
[5]  R[J]←+/(((*-X[1;J]×ZZZ)-*-X[2;J]×ZZZ)-X[3;J]×(*-ZZZ)-*-10×ZZZ)*2
[6]  →8×⍳ZMIN<R[J]
[7]  ZMIN←R[J]
[8]  J←J+1
[9]  COUNT←COUNT+1
[10] →0×⍳J>⍴X[1;]
[11] →5
∇
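
For readers without access to APL, the EVAL routine above can be mirrored by the short Python sketch below. It is a transcription of our reading of line [5] of the listing (the third parameter multiplies the second bracket, and the sum runs over x = 0.1, 0.2, ..., 1.0), not a replacement for the original code; the function name box_exponential is hypothetical, and the bookkeeping of COUNT and ZMIN is omitted.

```python
import math


def box_exponential(a1: float, a2: float, a3: float) -> float:
    """Sum of squared residuals of the three-parameter exponential problem."""
    total = 0.0
    for k in range(1, 11):          # x = 0.1, 0.2, ..., 1.0
        x = 0.1 * k
        residual = ((math.exp(-a1 * x) - math.exp(-a2 * x))
                    - a3 * (math.exp(-x) - math.exp(-10.0 * x)))
        total += residual * residual
    return total


if __name__ == "__main__":
    # Starting point used in the example of program operation.
    print(box_exponential(2.5, 10.0, 10.0))
    # One global optimum, and one member of the family a1 = a2, a3 = 0.
    print(box_exponential(1.0, 10.0, 1.0), box_exponential(2.0, 2.0, 0.0))
```

At the starting point (2.5, 10, 10) the sketch gives a value that agrees with the 275.881 shown in the example of program operation, and any point with a1 = a2 and a3 = 0 gives zero, which illustrates why the problem has infinitely many global optima.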


⍝EXAMPLE OF PROGRAM OPERATION


      COUNT,TOL,MPASS,MTEST,BEGIN

0      10      0.1      0.0001      4      1      2.5      10      10  275.881

      10 1 MACR BEGIN
TEST INDICATES NO MINIMUM; STEP[2]←STEP[2]÷10

PREDICTED MIN LARGER THAN KNOWN VALUES; STEP[2]←STEP[2]÷10

TEST INDICATES NO MINIMUM; STEP[2]←STEP[2]÷10
2.15424382  16.44652126  0.6601690751  0.1639626634

2.15424382  16.44652126  0.6601600751  0.1639626634

ooo       COUNT = 22       ooo

PREDICTED MIN LARGER THAN KNOWN VALUES; STEP[2]←STEP[2]÷10
TEST INDICATES NO MINIMUM; STEP[2]←STEP[2]÷10

PREDICTED MIN LARGER THAN KNOWN VALUES; STEP[2]←STEP[2]÷10
2.105092584  16.30252126  0.6716643134  0.1582622029

2.105092584  16.30252126  0.6716643134  0.1582622029

ooo       COUNT = 33       ooo
SELECTION  AND  EVALUATION


Chairman:       A.C.    Williams,  Mobil  Oil  Corp. 

Panelists:     J.   Gregory  Colahan,   General  Battery  Corp. 
J.R.   Ellison,  Mobil  Oil  Corp. 
Milton  M.  Gutterman,    Std.   of  Indiana 
David  Hirschfeld,  Management  Science  Systems 

Summary  of  panel  discussion  by  A.C.  Williams. 


How does an organization select a mathematical programming software
system for applications in a commercial environment? What are the
relevant factors to be considered, how can they be evaluated, and
how should they be weighted in arriving at a conclusion? These are
the questions addressed by the panel. On almost all matters there
seemed to be a consensus of opinion, even though the panel members
came to the questions from a number of different perspectives. This
note is an attempt to summarize that consensus.

As  in  so  many  other   instances  of 
work-a-day  system  evaluations,   there  are 
three  aspects  to  be  considered:      (i)  the 
functional,    (ii)    the  technical,   and  (iii) 
the  economics.     These  are  listed  in  the 
approximate  order  of  importance. 

Functional  concerns,   of  course,  begin 
with  asking  whether  the  particular  system 
under  consideration  will  meet  job  require- 
ments,   both  now  and  in  the  foreseeable 
future.     This  requires  an  evaluation  not 
only  of  the  system  being  offered,    but  of 
its  future  possible  path  of  evolvement. 
And  to  evaluate  that  requires  that  some 
hard  questions  be  asked  about  the  vendor. 
Will  he  remain  in  business?     If  he  does, 
will  he  maintain  it  and  keep  it  up  to  date, 
and  is  he  likely  to  have  it  evolve  in  a 
direction  suitable  to  our  needs? 

It is important to realize that the choice of a mathematical
programming system is likely to influence, and perhaps to influence
greatly, the future development of applications in the company. If
a code has a good GUB feature, large and comprehensive distribution
models can be developed and used. A general purpose MIP feature
could allow development in directions which would otherwise not be
possible. The same could be said of a quadratic programming option,
network flow subroutines, special purpose MIP's, a separable
programming option, etc.

An important functional question to be asked of a proposed system is
how well it can be integrated into the company's operating
procedures. Especially in instances where results are needed on a
real time basis, as in some day to day refinery operations, we need
satisfactory answers to these questions:

(i)  Does the system facilitate getting data and maintaining it?

(ii)  Will it be easy to ask questions and get answers (are the
matrix generators and the report writers satisfactory)?

(iii)  Will the system be reliable?

The question of reliability gets us into the technical questions.
Reliability is the question of how bug-free the code is, and how
stable it is (stability here refers to the code's ability to handle
numerical difficulties arising from near singularities and from
degeneracies). It is generally conceded that no code is bug-free or
stable for all problems, and that therefore applications systems
have to be designed to take those facts into account. A preliminary
reading on the reliability of a system can often be obtained by
asking around. The vendor will often supply a list of satisfied
customers. However, it should be borne in mind that his list of all
customers and his list of satisfied customers may not be identical.

Many of the "old hands" keep a number of test problems, carefully
safeguarded from any changes, to be brought forth to exercise a
proposed code. Usually, there will be at least one test problem that
will upset the stability of any code - the point being to see how
long it takes to bring it down, and how gracefully it goes down when
it finally surrenders.

The test problems should, of course, contain representative
problems, typical of those to be solved in production. Most users
are not surprised if their repertoire of problems turns up some bugs
in a code being tested by them for the first time. The real
questions are: How does the vendor respond? Does he give you a
workaround? Do new bugs continue to appear, or does the first spate
of bugs appear to be all of them?

Bugs and instabilities will persist in every code, and they will
have to be dealt with. An important technical property to be looked
for, therefore, is flexibility to adjust tolerances conveniently and
easily, and to control the flow of the solution by, e.g., adjusting
inversion frequency, level of multiple pricing and other parameters.

While  the  local  l.p.   technical  person 
wants  and  needs  such  controls,    in  many 
instances  he  wants  to  be  able  to  shield 
the  users  from  worrying  about  them. 
(Some  would  say  the  user  shouldn't  even 
know  about  them,   but  this  appears  to  be 
an  extreme  view.)     Therefore,   the  ease 
with  which  the  system  control  language 
allows  macros  to  be  written  and  executed 
is  an  important  feature  in  most  installa- 
tions. 

In  many  installations  the  ability  to 
do  recursion  would  be  an  important  prop- 
erty to  be  looked  for  in  a  code.      In  this 
case,    it  would  be  important  to  be  able  to 
read  solutions,   to  read  the  shadow  prices, 
to  analyse  them,   and  to  revise  the  coeffi- 
cients conveniently  and  easily  in  a  higher 
level  language,    such  as  Fortran  or  PL/1. 

The  economics  is  the  matter  of  cost 
effectiveness.     Benchmark  runs  have  to  be 
designed,   run,   and  analysed.     Some  of  the 
benchmark  test  problems  will  be  for  the 
purpose  of  exercising  the  code,  i.e., 
testing   it  for  bugs  and  for  its  stability. 
In  testing  for  cost  effectiveness,  however, 
actual  production  conditions  should  be 
simulated  as  nearly  as  possible.  Actual 
production  models  should  be  used,  if 
available.      If  most  production  runs  are 


made  from  an  advanced  starting  basis, 
with  a  few  revised  elements,   then  the  code 
should  be  tested  under  those  conditions. 
The  performance  of  the  code  in  solving  a 
problem  starting  from  scratch  is  almost 
irrelevant.      If  most  production  runs  are 
made  during  prime  shift  in  a  multi- 
programmed  environment,   then  the  test  of 
the  code  should  be  done  during  prime 
shift  in  a  multiprogrammed  environment. 
Of  course,   results  will  vary  from  run  to 
run,    so  that  several  runs  for  each  case 
will  generally  have  to  be  made  to  get  an 
average  and  a  variance. 
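
As a small illustration of the last point, the run-to-run variation of a benchmark case can be summarized as below. This is a minimal sketch, assuming the CPU times of repeated runs of one case have been collected in a list; the function name and the sample figures in the usage note are hypothetical and are not measurements from any of the systems discussed.

```python
import statistics
from typing import Sequence, Tuple


def summarize_runs(cpu_seconds: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample variance) of repeated benchmark timings."""
    if len(cpu_seconds) < 2:
        # A variance cannot be estimated from a single run.
        raise ValueError("at least two runs are needed")
    return statistics.mean(cpu_seconds), statistics.variance(cpu_seconds)
```

For example, three hypothetical prime-shift runs of 10, 12 and 14 CPU seconds give a mean of 12 and a sample variance of 4; two codes whose means differ by less than a couple of standard deviations should not be ranked on timing alone.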

Computation  associated  with  the 
mathematical  programming  activities  can 
generally  be  broken  down  into    (i)  matrix 
generation,    (ii)    "execution,"  and  (iii) 
report  writing.     The   "execution"  can 
further  be  broken  down  into  "primal"  and 
"other"    (typically  converting  the  input 
data  into  the  formats  to  be  used  by  the 
code,    e.g.    SET  UP,  CONVERT). 

For at least one production shop it was reported that the execution
step accounts for only about 60% of the total CPU time charged to
linear programming, the other 40% being taken up by matrix
generation and report writing. And only a total of 40% was spent in
primal, i.e., the actual execution of the simplex method
calculations. There seemed to be some agreement that these kinds of
proportions were not atypical. The conclusion is, of course, that
the cost performance of the matrix generator and report writers can
be significant and important, and should be included in the
benchmark tests.
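
The arithmetic behind this conclusion can be made explicit with a short calculation. The sketch below applies the usual fixed-fraction speedup formula (often called Amdahl's law), which is our illustration and was not part of the panel discussion: if only a fraction of the total CPU time is spent in the part being improved, the overall gain is limited by the remainder. The function name is hypothetical; the 40% figure is the primal share quoted above.

```python
def overall_speedup(fraction_improved: float, factor: float) -> float:
    """Overall speedup when a fraction of the work runs `factor` times faster."""
    if not 0.0 <= fraction_improved <= 1.0 or factor <= 0.0:
        raise ValueError("fraction must lie in [0, 1] and factor must be positive")
    return 1.0 / ((1.0 - fraction_improved) + fraction_improved / factor)


if __name__ == "__main__":
    # A code whose primal is twice as fast, with primal at 40% of total CPU.
    print(overall_speedup(0.4, 2.0))
```

With primal at 40% of the total, a primal that runs twice as fast shortens the whole job by only one fifth, and even an arbitrarily fast primal cannot reduce total CPU time by more than 40%.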

What  to  do  about  tuning  is  a  major 
headache  in  benchmarking  codes.  The 
amount  of  main  memory  assigned  can  have 
a  major  effect  on  the  performance  of  the 
code,   and,   of  course,   an  inverse  effect 
on  the  other  work  running   in  the  computer. 
It  is  generally  recommended  that  a  number 
of  different  memory  size  assignments  be 
tested  for  the  major  application  types, 
and  that,   on  the  basis  of  that  data,  a 
recommended  size  be  arrived  at  by  use  of 
the  charge  out  system  and/or  negotiation 
with  the  computer  center. 

How much tuning of the parameters of the code itself should be done
is a difficult question. Which parameters are most important is, of
course, initially best determined by consultation with the vendor or
with other users. If the range of different problem types to be
solved is fairly narrow, it can pay to spend considerable time and
energy to get good parameter settings, whereas if the problem set is
very variable and subject to change, it may be that one should not
go much beyond using the default settings suggested by the vendor.

Overall, the benchmark problem is difficult because of the many
variables involved, and the fact that run results are themselves
variable. Unless great care is taken in designing the tests, the
number of runs and the consequent amount of computer resource used
can get out of hand. And, unless great care is taken in interpreting
and analysing the results, wrong conclusions can be reached.

In  summary,   any  organization  with  a 
significant  present  or  contemplated  mathe- 
matical programming  work-load  will  find  it 
worthwhile  to  exercise  a  good  deal  of  care 
in  the  selection  of  a  software  system. 
The  issues  discussed  in  this  note  are 
probably  the  issues  that  the  organization 
needs  resolved.     But,  more  important  than 
any  specific  issue  is  to  assign  the  eval- 
uation task  a  reasonable  priority  and 
budget,   and  to  assign  a  good  person  to  the 
task. 


Attached are tables which may be used as general guidelines to some
of the mathematical programming systems and matrix generators/report
writers. The tables are based on the best information that was
available at the time they were prepared. New systems and new
features are continually being brought out, however, so that any
potential user would have to obtain the most up to date information.
EXPERIENCES IN THE DEVELOPMENT OF A LARGE
SCALE LINEAR PROGRAMMING SYSTEM

Robert J. Sjoquist
Control Data Corporation
4201 North Lexington Avenue
St. Paul, Minnesota 55112


ABSTRACT 

There has been increasing interest in software development techniques over the past few
years. Effective techniques for developing reliable, maintainable and extendable
software significantly reduce total software costs. Several techniques have been
described in recent literature. The objective of this paper is to report experience with
some of these techniques in the development of APEX-III, a large scale linear programming
system.

Several of the techniques are felt to contribute to a successful development. They
include a project staff organization similar to a "chief programmer team", documentation
and coding standards, and open programming.

Techniques which did not appear directly applicable to the development were structured
programming and top-down programming.

Finally, it is noted that a significant amount of effort is required to prepare accurate
estimates of development costs.


INTRODUCTION 

Last year, our company completed development of a large scale linear
programming system, APEX-III. Several recently publicized
mathematical programming and software development techniques were
considered and used in the development. Over the past year, we have
had the opportunity to evaluate the effectiveness of these
techniques.

A development of this kind requires two distinct types of skills:
mathematical programming skills and software development skills. By
mathematical programming skills, we mean those skills unique to a
linear programming system. They include a knowledge of the type of
problems to be solved. An efficient design considers model
characteristics such as size, density and structure. Familiarity
with characteristics of current algorithms is needed to properly
match the algorithms to the computer. Performance and numerical
stability of a code reflect the programmer's understanding of the
algorithm.

Just as important as the mathematical programming skills are the
software development skills. These are the skills required for
development of any large programming system. They include a
knowledge of the computer, the operating system, the compilers and
assemblers. They include a familiarity with data handling techniques
and means for efficient use of the hardware. These skills also
include a working knowledge of techniques for producing reliable and
maintainable software products.

The purpose of this paper is to report our experience with these
software development techniques.


THE  DEVELOPMENT  PROJECT 


Ibi 


The development occurred over a four-year period and in two
distinct phases. The first was the development of an in-core,
no-frills linear programming system. Five programmers completed
the development in less than one year. The code was refined over a
two-year period by one programmer working on a casual basis.

The second development phase involved adding several major features
to the system. These included an out-of-core capability, mixed
integer and parametric programming, and a matrix reduction
facility. The product was required to operate on three series of
computers and run under three different operating systems. This
development phase involved as many as 15 programmers over a 1-1/2
year period. The objective of each phase was to produce a
commercially profitable software product.


THE  DEVELOPMENT  TECHNIQUES

Project staff organization was similar to the "chief programmer
team" described by Baker and Mills [1]. The chief programmer team
consists of a chief programmer, a backup programmer, a project
secretary, and a pool of programmers.

The chief programmer is generally the most senior member of the
development team. He is a working member of the team with overall
technical responsibility for the project. On this project, his
duties included system design, assigning tasks to project members,
and co-ordinating efforts of individual programmers. He also
participated in coding and testing the system. In the first phase
of development, he produced as much code as any other project
member. In the second phase, he produced far less code. More time
was spent on design and technical administration of the project.

The backup programmer assists the chief programmer and shares his
duties. He is familiar with the entire project and is able to
assume chief programmer responsibility at any time.

The project secretary maintains a master library of the system and
documentation during development. The secretary incorporates new
and corrective code in the system, keeps a record of all changes
made, and maintains listings of the system. The project secretary
described by Baker and Mills also was responsible for making and
keeping a record of all test runs. In this development, project
members took turns handling the secretarial duties. Only the most
significant test runs were recorded during the development.

System subroutines were written in FORTRAN and assembly language.
Routines critical to performance and those handling packed data
were written in assembly language. Remaining routines were written
in FORTRAN. Extensive use of FORTRAN eased the problems of
executing under different operating systems. Subroutines could be
coded and tested more quickly using FORTRAN than using assembly
language. This left more time for writing performance critical
routines.

Documentation standards were specified and enforced. At the start
of the project, each member was given a loose leaf notebook for
internal documentation. The notebook was added to as subroutines
were defined and designed and updated as routines were modified.
Internal documentation consisted of subroutine descriptions,
communication region (CR) cell descriptions, and array/file
formats. Subroutine and CR cell descriptions were maintained on
the program library. This documentation was extracted from the
library via computer program, copied and distributed to project
members whenever changes were made.

Subroutine documentation consisted of comment cards within each
routine. We felt documentation was more likely to be kept current
when included in the routine. Documentation was then updated at
the same time and in the same manner as the executable code. Two
levels of documentation were required for subroutines. The first
level describes what the routine does and how it is called. This
level is written when a necessary operation is defined. The second
level is a step by step description of how the routine performs its
function. This documentation is the result of subroutine design
and is equivalent to flow charts. Both levels are written before
the subroutine is coded.

CR cell definitions and documentation were maintained as a data
file on the system library. Subroutine COMMON statements were
generated from this file by a computer program. This program also
produced printed documentation of the CR cells.
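
The arrangement just described, in which a single data file drives
both the COMMON statements and the printed CR cell documentation,
might be sketched as follows. This is only an illustration in
Python, not the project's actual program: the record layout, the
COMMON block name CRCOM, and the three cell names are assumptions
made up for the example.

```python
# Illustrative sketch only: the record layout, the COMMON block name
# "CRCOM", and the cell names below are hypothetical, not from the paper.

CR_CELLS = [
    # (name, description)
    ("IPMXIT", "maximum number of iterations allowed"),
    ("RTPIVT", "pivot tolerance"),
    ("ANOBJF", "name of the objective function row"),
]


def common_statement(cells, block="CRCOM", width=72):
    """Build a fixed-form FORTRAN COMMON statement listing every cell.

    Text starts in column 7; continuation lines carry a '1' in column 6.
    """
    lines = []
    current = "      COMMON /%s/ " % block
    for i, (name, _) in enumerate(cells):
        piece = name + ("," if i < len(cells) - 1 else "")
        if len(current) + len(piece) > width:
            lines.append(current.rstrip())
            current = "     1 "
        current += piece + " "
    lines.append(current.rstrip())
    return "\n".join(lines)


def cell_documentation(cells):
    """Build the printed documentation from the same records."""
    return "\n".join("%-8s %s" % (name, text) for name, text in cells)


if __name__ == "__main__":
    print(common_statement(CR_CELLS))
    print(cell_documentation(CR_CELLS))
```

Because both outputs come from the same records, a cell cannot be
added to the COMMON block without also appearing in the printed
documentation, which is presumably the benefit the authors were
after.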

A minimal set of coding standards was established. The primary
purpose of these standards was to promote simple, readable code.
The standards were intentionally minimal so they were easily
followed. Standard FORTRAN calling sequences were used for all
subroutines. Subroutine names begin with a (3 or a J if a function
subroutine. CR cells were named to differentiate them from local
variables and to describe their use. The first character of a CR
cell name specifies whether its contents are alpha, integer, or
real. The second character of the name specifies its use. The
name then defines the cell to be a name, switch, index, count,
parameter, or tolerance. Statement numbers in FORTRAN routines are
in ascending order. Each routine was restricted to a single exit.
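
A naming rule of this kind lends itself to mechanical checking.
The sketch below, in Python, decodes a CR cell name from its first
two characters. The paper states the rule but does not give the
actual letter codes, so the code tables used here (A, I, R for
contents; N, S, X, C, P, T for use) and the example names are
assumptions chosen purely for illustration.

```python
# Hypothetical code tables: the paper states the rule but not the letters.
CONTENTS = {"A": "alpha", "I": "integer", "R": "real"}
USES = {
    "N": "name",
    "S": "switch",
    "X": "index",
    "C": "count",
    "P": "parameter",
    "T": "tolerance",
}


def describe_cr_cell(name):
    """Return (contents, use) for a CR cell name, or raise ValueError."""
    if len(name) < 2 or len(name) > 6:
        raise ValueError("names are assumed to be 2 to 6 characters long")
    contents = CONTENTS.get(name[0])
    use = USES.get(name[1])
    if contents is None or use is None:
        raise ValueError("%r does not follow the CR naming convention" % name)
    return contents, use
```

Under these assumed codes, describe_cr_cell("RTPIVT") returns
("real", "tolerance"), while a name such as "ZZTOP" is rejected.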
Structured programming, top-down programming, and open programming
were considered for the second development phase.

Structured programming [1,2,3] generally does not allow GO TO
statements. Rather, all decisions are programmed using
IF-THEN-ELSE and DO constructs. The purpose is to provide easily
readable code by avoiding complicated branches of control. The
IF-THEN-ELSE constructs serve to define all conditions under which
a section of code is executed.

One routine was written using structured programming to evaluate
its usefulness for this development. The routine performed a
complicated vector packing operation. Structured programming rules
were rigorously applied for the investigation. The technique did
not produce the desired results. Rather, the structure had to be
forced upon the routine and detracted from its clarity and
performance. The technique was not used in remaining development.
However, the idea was not entirely rejected. Reasonable efforts
were made to avoid unnecessary backtracking and to allow control to
flow from top to bottom. Sections of code which could be entered
in special ways were flagged with comment cards.

Top-down programming [1,3] is a technique to organize the
development into levels of detail. The basic inputs, outputs, and
operation of the system are first defined. Next a first level of
detail of operation is added to the system. The process is
repeated until all levels of detail are defined. Each level can be
designed, coded, and tested independently of succeeding levels.
The technique was not used in this development. We felt that
design of an efficient linear programming system consists of
several iterations of a top-down, bottom-up sequence. Performance
critical routines are dependent on details of method and file and
array formats. These aspects need to be formulated quite early in
the design process. To a degree, these details determine the form
of higher level routines.

Open programming [1,2] is a technique which was used in this
development. With this technique, each programmer's design,
documentation, and code is reviewed by another project member. A
thorough review should result in improved readability of
documentation and code. The review should also uncover errors that
may be missed during system testing. A thorough review also
requires a considerable amount of effort. In this development, all
documentation and approximately half the code was reviewed. Each
programmer first had design documentation reviewed by another
programmer. After review and coding through a clean compile, code
was similarly reviewed. Additionally, all design documentation was
reviewed by either the chief or backup programmer.

Software verification is a difficult and costly part of any
development. It is economically infeasible, if not impossible, to
find all errors in a software product. Nevertheless, the software
must work correctly nearly all the time to be useful and
profitable. Nearly all software development techniques described
above contribute to software reliability. Three direct methods of
verification were used in this development.

The open programming review was the first means of verification.
Secondly, several utility routines were tested independently of the
system in a simulated environment. This was an attempt at checking
out all paths through critical routines. Thirdly, the integrated
system was tested. Several models were generated to test special
conditions. The bulk of the testing, however, relied on real world
linear programming models for system checkout.


CONCLUSIONS 


The project was completed on schedule, costs were within 10% of
budget, and performance and reliability standards were met.

Documentation and coding standards were considered critical to
success of the project. Without these standards, it would not have
been possible to integrate and debug the system on schedule.
Additionally, documentation produced during the first phase of
development considerably eased the second phase effort. Several of
the problems that were encountered were in areas where standards
were not closely followed.

Open programming appeared to be well worthwhile. The review
uncovered bugs and resulted in improved documentation. We feel the
time spent on the review was more than made up for in the test
phase. Initial debugging indicated reviewed code had approximately
1/3 the errors of non-reviewed code. The quality of review could
be improved if all reviews were done by the chief or backup
programmer. These two individuals are most familiar with the
entire system. However, this places additional time demands on two
critical resources.

Project statistics confirmed the theory that considerable effort is
required to produce accurate estimates. The project allowed
spending a portion of the original budget before producing final
cost estimates. Table 1 compares the accuracy of these estimates
to the portion of work complete when estimates were made.

                        Table 1

          Errors in Estimation of Major Tasks

          Portion Complete       Error in
  Task    Before Estimate        Estimate

   A           17%               under 2?%
   B           12%               nil
   C            3%               under ?0%
   D           17%               over 16%

The error in estimate for task A was caused by a change of scope
imposed upon the project. The negligible error for task B shows
that costs can be controlled, given a reasonable estimate. The
error for task C resulted because insufficient time was spent on
the estimate. Task D was completed with less effort than estimated
because a simpler implementation was devised.
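
The "under" and "over" labels used for estimate errors can be made
concrete with a small Python sketch. The paper does not say
whether its percentages are taken relative to the estimate or to
the actual cost; this sketch assumes the actual cost, and the
figures in the usage note are hypothetical rather than project
data.

```python
def estimate_error(estimated, actual):
    """Classify an estimate and give its error as a percent of actual cost.

    Assumption: the error is measured relative to the actual cost; the
    paper does not state which base was used.
    """
    if actual <= 0:
        raise ValueError("actual cost must be positive")
    percent = abs(estimated - actual) * 100.0 / actual
    if estimated < actual:
        return "under", percent
    if estimated > actual:
        return "over", percent
    return "nil", 0.0
```

For example, a task estimated at 116 units of effort that actually
took 100 would be reported as ("over", 16.0), and one estimated at
80 that took 100 as ("under", 20.0).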

Finally, it should be mentioned that individual programmers'
attitudes are key to the success of any development. Projects are
completed on time by programmers who are determined to finish on
time. Quality products are turned out by programmers who take
pride in the quality of their work. These programmers can make
effective use of new software development techniques.


MATH  PROGRAMMING  USERS  VS  THE  COMPUTER  CENTER
(A  personal  perspective  as  seen  from  a  foxhole)

J.   R.  Ellison 
Mobil  Oil  Corp. 


The  fact  that  there  is  a  general  and  on- 
going conflict  between  the  users  and 
their  computer  center  is  generally,  and 
I  believe  correctly,   taken  for  granted. 
If  you  will  accept  this  as  a  fact  of  life, 
let's  analyze  why  this  conflict  exists. 

The  first,   and  I  believe  foremost,  reason 
is  that  each  views  the  other  as  the 
proverbial   "black  box"  and,   since  there 
is  little  personal  contact  with  or  real 
understanding  of  the  other's  problems  or 
motivations,   conflicts  arise. 

One  source  of  conflict  is  the  difference 
in  objectives.     The  primary  objective  of 
a  computer  center  is  often  stated  to  be 
to  "maximize"   throughput,   efficiency,  etc. 
This  can  be  measured  as  computer  center 
income    (charges)    vs  budget  prediction,  as 
how  well  job  schedules  are  met,   or  as 
wall  clock  hours  or  CPU  hours  per  day  the 
center runs. Other popular measures are
how many jobs are run each day and "percent
CPU utilization".

The  Math  Programming  user  usually  has  as 
a  primary  objective  the  provision  of  an 
answer  to  a  question  asked  by  some  level 
of  management  pertaining  to  the  economic 
consequences  of  taking  or  not  taking  some 
course  of  action.     This  question  includes 
defining  the  physical  courses  of  action 
which  go  with  the  economics.     In  pro- 
viding the  above  answer,   or  answers,  the 
Math  Programming  user  is   seldom  able  to 
schedule  his  work  since  questions  are 
addressed  to  him  and  "outside"   data  is 
provided  to  him  on  a  time  table  which  he 
cannot  control  and  the  answers  are  re- 
quired by  a  predetermined  date  (prefer- 
ably yesterday)   which  is  not  under  his 
control  and  is  often  not  under  the  con- 
trol of  the  person  asking  the  question. 
The  resulting  work  load  disrupts  the  com- 
puter center  schedules  with  a  sudden  on- 
slaught of  computer  load  as  measured  by 
CPU  time  without  a  corresponding  increase 
in  the  number  of  jobs  run. 

The  user's  habit  of  preferably  working 
only  days    (prime  shift)    compounds  the 
problem  since  he  wants  to  make  and 


correct  errors  on  the  same  day  so  he  can 
submit  more  work  that  night  or,  better 
yet,  get  more  prime  shift  runs  that  day. 
The Math Programming user is also "problem"
oriented and looks on the computer center
as the provider of a tool required to solve
"his" problem so that he can provide an
answer to a complex question.
He  views  the  center  as  a  cost  and  as  an 
obstacle  to  be  overcome  when  it  does  not 
meet  his  requirements.     After  all,  com- 
puter centers  don't  make  money  -  only 
answers  make  money. 

Comparisons of the above statements show
areas of conflict between scheduled and
unscheduled work, specific vs broad
objectives, and differences in objectives
and measurement of effectiveness.

The computer center viewpoint can often be characterized as a
responsibility for maintaining a complex operating system and
hardware which must respond to the demands of a wide variety of
users, none of whom understand the computer center's problems or
how a computer actually functions. They also appear to believe
that Math Programming users are unreasonable people who can shut
down or stretch out job processing for other users (especially
scheduled users) if they are allowed the amount of CPU time they
need. Unfortunately, these viewpoints often have a large element
of truth in them. Stretching out of job processing causes
scheduled jobs to miss their schedule and has been known to make
accounting type users extremely unhappy with the computer center.
This is important to the computer center because accounting types
are much more numerous, are very adept and well trained at writing
complaining memorandums, and the computer centers are often
controlled by accounting types. If you don't like the above, just
substitute technical types or whoever has been causing you the most
trouble and it will be just as applicable.


The Math Programmer's point of view is that "we" have to obtain
"optimal" answers to questions posed by management and these
answers must be available by a given deadline. Unfortunately, we
must expend that limited resource known as time to obtain data
necessary to understand and correctly model the problem as stated;
we must spend additional time removing "bugs" from the model and
revising the model to fit the actual question as opposed to what we
thought the question was and, finally, we must obtain sufficient
time from the computer center to allow us to meet our answer
deadline and thus stay out of trouble. This last item is known as
providing "reasonable" turnaround and, since it is at the end of
the time chain, is the source of much agony to all concerned.


The Math Programming user, when asked by the center how much
computer time he will need, simply says he doesn't know. If he
knew how much time he needed and how many runs, he would know the
answer and would not need any. While there is much truth in this,
most computer centers still like predictable and well scheduled
runs which he has just told them he doesn't have.


Now, with the above problems and conflicts of interest, how can we
learn to co-exist with a computer center and get work accomplished
without giving in to them? I believe that it is possible by
defining your general requirements in terms of the following:

(1)  CPU time required for prime shift and total shift.

(2)  Job turnaround. Analyze what you are asking for and be
     reasonable. Turnaround isn't as simple as it sounds.

(3)  Storage space requirements. Permanent storage space is
     critical. Disk space is almost always in short supply,
     expensive, and highly desirable. Tape is plentiful, cheap,
     and less desirable since it must be mounted. System "scratch"
     space (temporary space) is also limited since it is usually
     disk. Remember that tapes can be limited by the number of
     tape drives available. You must be realistic with your space
     demands if you expect turnaround since multiprogramming means
     that there are more users than just you demanding space at
     any one time. Extreme requirements can delay your jobs.

(4)  Memory requirements - memory can be expensive and limiting.
     Be reasonable and careful even if VS (Virtual Storage) is
     around.

(5)  Math Programming codes - you should know more about which
     codes are best for you than the computer center does, but be
     prepared to prove your point of view and to justify the costs.

To define the above requirements in anything approaching a
quantitative manner, you need tools which will allow you to define
where you are and where you were. Reports based on the System
Management Facilities (SMF) or its equivalent are the best that I
have found. Don't rely on opinions, feelings, "informed" opinions,
etc. They are unreliable, cannot be compared, and are an excellent
booby trap for the unwary. Careful hand tabulation of data from
your runs is better than nothing but very time consuming,
especially if you collect enough data to be really useful.

Now  let's  touch  on  some  of  my  personal 
biases  and  prejudices    (I  prefer  to  call 
them  measurements)   which  I  swear  by  when 
they  agree  with  my  point  of  view  and  at 
when  they  disagree. 

(1)  User  turnaround  -  time  measured  from 
job  in  at  the  reader  to  finished  job 
output  available  to  the  user.  The 
user  correctly  loves  this  but  be 
careful.     Printout  priorities  and 
length  of  printout    (lines  and  printer 
speed)    can  influence  this  measure  and, 
since  this  isn't  truly  controlled  by 
the center, is not exactly a fair
measure of computer center performance.

(2)  Center  turnaround  -  time  measured 
from  job  in  at  the  reader  to  job 
finished  by  the  initiator  and  turned 
over  to  the  output  queue.     I  consider 
this  a  measure  of  the  center's  ability 
to  provide  service.     It  includes  time 
in  the  input  queue  plus  time  in  the 
computer  and  thus  measures  items 
"controlled"   by  the  center  and  is  a 
measure  of  their  ability  to  do  work. 

(3)  Execution  ratio  -  wall  clock  time 
from  initiator  start  to  initiator 
finished  divided  by  problem  program 
CPU  time.     This  is  an  indication  of 
the  load  on  the  CPU  and  can  be  used 
as  an  indication  of  whether  or  not 
you  are  getting  your  "fair"   share  of 
the  computer.     It  is  also  useful  in 
estimating  how  long  your  job  will  be 
in  the  computer. 

The  above  three  measurements  must  be  accu- 
mulated,  averaged  over  a  reasonable  time 
span,   and  then  plotted  so  that  you  can 
see  trends  or  changes  from  whatever 
"normal"   is  or  becomes. 

Additional information which allows you to see how your job load
fits into the computer center and its schedule is helpful. Data on
what percent of the total CPU time, charges, or jobs you represent
can be useful, especially if it is significantly large. You should
also be aware of any special services the center does, or doesn't,
perform for you.
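
The three turnaround measurements defined above (user turnaround,
center turnaround, and execution ratio) can be computed and
averaged mechanically once the job timestamps are tabulated. The
Python sketch below shows one way this might be done. The record
layout and field names are assumptions made for the example; they
do not reproduce any actual SMF record format, and the weekly
grouping is just one choice of "reasonable time span".

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Job:
    # Field names are hypothetical; real values would come from
    # accounting reports or hand tabulation of run data.
    read_in: datetime        # job in at the reader
    init_start: datetime     # initiator start
    init_end: datetime       # initiator finished, job passed to output queue
    output_ready: datetime   # finished output available to the user
    cpu_seconds: float       # problem program CPU time (must be > 0)


def measures(job):
    """Return (user turnaround, center turnaround, execution ratio).

    Turnarounds are in seconds; the execution ratio is wall clock time
    in the initiator divided by problem program CPU time.
    """
    user = (job.output_ready - job.read_in).total_seconds()
    center = (job.init_end - job.read_in).total_seconds()
    wall = (job.init_end - job.init_start).total_seconds()
    return user, center, wall / job.cpu_seconds


def weekly_averages(jobs):
    """Average the three measures by ISO (year, week) of job submission."""
    groups = defaultdict(list)
    for job in jobs:
        year, week, _ = job.read_in.isocalendar()
        groups[(year, week)].append(measures(job))
    return {
        key: tuple(sum(col) / len(col) for col in zip(*rows))
        for key, rows in sorted(groups.items())
    }
```

The dictionary returned by weekly_averages can be plotted week by
week to show trends or departures from whatever "normal" is.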

Once you have all of this information, you should meet with
computer center management and try to obtain commitments and
agreements on levels of service which they will provide to you.
This written agreement will then provide a basis for co-existence
between you.


OPERATIONAL  MANAGEMENT  OF  MATHEMATICAL
PROGRAMMING  BASED  PLANNING  SYSTEMS


E.  G.  Kammerer 
UNION  CARBIDE 
CHEMICALS  AND  PLASTICS  DIVISION 
Operations  Planning  Department 


Chemicals and Plastics production accounts for approximately 40 percent of Union
Carbide's sales, resulting in the movement of 19 billion pounds of production annually.
This activity involves more than 6,000 products manufactured at approximately 15 plants
and moved through more than 100 bulk terminals and warehouses. We deal with more than
30,000 industrial customers utilizing six deepwater vessels, 150 inland barges, and more
than 8,000 tank cars, hopper cars, and van boxes.


My purpose in describing Union Carbide's production/distribution network is to
highlight the need for the use of mathematical optimization techniques in effective
planning. Obviously, it is a complex business, with highly integrated product/location
characteristics in a changing business environment.


The  planning  system  used  to  help  manage  our  chemical  operation  generally  consists 
of  three  phases: 

1.  Gathering  and  processing  data  as  input  to  the  mathematical  optimization 
system. 

2.  The  solution  of  a  mathematical  model  to  develop  the  optimal  production  and 
distribution  plan  for  the  period. 

3.  Communication  of  the  results  of  the  planning  model  to  the  various  functions 
responsible  for  the  operation. 


(See  attached  flowchart  of  planning  cycle.) 


The  successful  management  of  a  planning  system  depends  on  all  three  of  these 
phases.     Any  one  without  the  other  two  is  useless.     I  will  discuss  each  of  the  phases 
with  particular  attention  to  the  needs  of  the  mathematical  model. 


The first phase of the planning system, the gathering of data, can be broken into
two separate classes. The first is standard data, that is, information which does not
change very rapidly. Normally, a yearly review and update of this is sufficient. The
type of data included here is annually budgeted cost, capacities, production factors, etc.


The other class of input data consists of the more current operating data which
can change from day to day. Transportation cost and limitations, inventory strategies,
production limitations and strategies, and the sales forecast are examples of this
type of input.


This strategic information is the most difficult and the most critical of the
data needed. It is gathered through direct interface with various operating groups. A
good understanding of their functions and objectives is necessary so that the input
can be interpreted. Normally, people in operations are not knowledgeable in the
functioning of mathematically based planning systems. Because of this, all meetings to
gather this input data must be handled with the idea of getting the understanding and
cooperation of these various other groups. It is my responsibility to interpret their
input and strategies and develop the proper model representation. This must be done
without putting unnecessary restrictions in the model which will prevent true
optimization.


The second phase of this process is the solution of a mathematical model. The
particular model used by Union Carbide's Chemical and Olefin Division consists of a
linear program matrix containing approximately 12,000 variables and 3,000 equations.
Variables represent all raw materials, intermediates, and finished products. A number
of non-linear plant models are linked to this linear matrix. The resulting model is
solved using a recursive technique on an IBM System/370 Model 168. Mathematical
Programming System Extended (MPSX) is used to set up and solve the linear portion.
Depending on the number of iterations required to reach optimum, the computing time
required is normally less than two hours. It is important to note that this is a
batch type system. The problems involved with running a batch system are quite
different from those of an on-line system.


In  running  this  model,   we  interface  with  two  separate  functions.  Computer 
Operations  has  responsibility  for  hardware  availability  and  priority.     Systems  Support 
is  responsible  for  supplying  programming,   system  maintenance,   or  software  support  if 
needed.     Problems  often  develop  in  obtaining  necessary  priority  due  to  the  size  of  the 
planning  system  and  the  keen  competition  for  limited  computer  resources.     Because  of 
this,   it  is  desirable  to  place  model  runs  on  production  status.     That  is,   runs  should 
be  planned  in  advance  and  the  support  commitment  of  Computer  Operations  obtained. 


The  demand  on  computer  resources  is  cyclic.     During  particular  times  of  the  month 
or  quarter,   the  demand  on  the  computer  is  particularly  heavy.     By  planning  your  major 
runs  during  low  demand  time  or  off-hours  such  as  nights  and  weekends,   it  is  possible  to 
improve  your  turnaround  time. 

Anyone involved in programming should be aware of the importance of program
efficiency. The CPU time required by a job affects what turnaround it will get. Jobs
that stay in the computer and slow down the network are not run during high demand time.
When running on virtual storage, as is the case in most large installations, the
efficiency of the program, that is, the ratio of CPU time to elapsed time, is extremely
important. The longer a job stays in the computer, the greater the chance of system
problems developing and the premature cancellation of the job. In addition, an
efficient program reduces the operating cost of running the system. Various techniques
can and should be used to improve program efficiency when designing a batch type system
that will be operated regularly.
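
The efficiency measure named above, the ratio of CPU time to elapsed time, is simple
enough to state as a short Python sketch. The function name and the figures in the
note that follows are illustrative assumptions, not data from the planning system.

```python
def cpu_efficiency(cpu_seconds, elapsed_seconds):
    """Ratio of CPU time to elapsed (wall clock) time for one job.

    For a job running on a single processor the result lies between
    0 and 1; values near 1 mean the job spent little time waiting.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return cpu_seconds / elapsed_seconds
```

A hypothetical run that used 45 minutes of CPU time during a two-hour stay in the
computer would have an efficiency of 2700 / 7200 = 0.375. The same run finishing in
one hour would score 0.75 and would spend half as long exposed to system problems.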

Another important consideration is to ensure that any important instruction to
the operator, such as mounting tapes or writing over tapes, is given as close as
possible to the time the action is to be taken. The manager of the planning activity
has little control over the terminal operators. In addition, the terminal operators
are involved in many activities at the same time. Therefore, it is necessary to
minimize the opportunity for misunderstandings that may result in premature termination
of a run. In large systems, this can be extremely costly in both money and time.


At the present time, we are operating our system at a remote terminal. This
arrangement adds to the difficulties mentioned above. It is impossible to be as
familiar with changes in software and hardware as when operating directly from the
computer center. In the past, we were located at the computer center, and a better
understanding and more effective use of the computer were much easier to achieve.


When  developing  and  programming  a  system  such  as  the  one  I  have  described,  that 
is,  one  that  will  be  run  routinely  in  support  of  a  planning  or  operating  function,  it 
is  very  important  to  involve  the  system  analyst  when  deciding  such  elements  as  the 
number  of  variables,   procedures  for  updating,  maintenance,   etc.     It  seems  that  time 
and  time  again  we  are  in  a  situation  where  we  need  to  be  able  to  analyze  one  more 
variable  than  the  program  will  allow  and  the  limit  on  variables  was  purely  an 
arbitrary  decision  made  by  the  designer.     When  deciding  the  limit  of  program  capa- 
bility,  the  operating  analyst  is  in  a  better  position  to  determine  the  need  for  more 
or  less  flexibility. 

A very important part of the planning system is the report writer. While the
system analyst may be able to interpret an L.P. matrix, the need to review the
preliminary results with other functions exists. This highlights the need for reports
that can be used to review the solution and identify needed corrections. These reports
must be readable by persons not familiar with computer output. The design of these
preliminary reports can have a great effect on the ease of review and the reasonableness
of the final plan.

The  third  and  final  stage  of  the  total  planning  system  is  the  communication  of 
the  results.     As  I  stated  earlier,   all  elements  are  equally  important.     The  best  de- 
signed program  with  the  best  input  is  useless  if  nothing  is  done  with  the  results. 

I am generalizing when I say this, but it is my personal feeling that efforts to
use direct computer output as the means of communicating the results to the various
functional groups are not as effective as interpretation by an analyst. We do use
computer output when detail is needed for specific action. But I do not feel it can
interpret and communicate the results of such a complex planning system. A considerable
effort on the part of system analysts and other experienced people is needed to interpret
the results and explain all of their significant implications. A knowledge of the
products and the distribution network is as important as computer experience.

Your  reaction  at  this  point  may  be  that  mathematical  programming  plays  no  part 
in  the  last  phase.     This  is  not  the  case.     The  understanding  and  cooperation  of  the 
programmer  is  essential  to  the  system  analyst  when  evaluating  the  L.P.    solution.  I 
have  found  that  the  more  significant  elements  of  the  solution  are  not  straightforward. 
Usually,   some  follow-up  case  studies  or  sensitivity  analysis  is  needed.     The  mathematical 
programmer  plays  an  important  role  during  these  studies.     His  advice  and  understanding 
of  the  program  logic  can  save  considerable  time  and  result  in  a  more  meaningful  plan. 
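
As a minimal sketch of the kind of follow-up sensitivity study mentioned above, the
Python fragment below re-solves a deliberately tiny allocation problem while one input
(demand) is varied. The bakery names, unit costs, and capacities are hypothetical and do
not come from the system described here. The cheapest-first rule is exact only because
the example has a single product and a single destination; a real L.P. system would
re-run its solver for each case instead.

```python
# Hypothetical data: unit cost (production plus shipping) and capacity per bakery.
BAKERIES = {
    "A": {"unit_cost": 1.00, "capacity": 60},
    "B": {"unit_cost": 1.20, "capacity": 50},
    "C": {"unit_cost": 1.50, "capacity": 80},
}


def allocate(demand, bakeries=BAKERIES):
    """Fill demand from the cheapest bakery first.

    Returns (plan, total_cost). Raises ValueError if demand exceeds total capacity.
    """
    remaining = demand
    plan = {}
    total_cost = 0.0
    by_cost = sorted(bakeries.items(), key=lambda item: item[1]["unit_cost"])
    for name, data in by_cost:
        qty = min(remaining, data["capacity"])
        plan[name] = qty
        total_cost += qty * data["unit_cost"]
        remaining -= qty
    if remaining > 0:
        raise ValueError("demand exceeds total capacity")
    return plan, total_cost


def demand_sensitivity(base_demand, changes=(-10, 0, 10)):
    """Re-solve once per demand change and collect (demand, plan, cost)."""
    results = []
    for delta in changes:
        plan, cost = allocate(base_demand + delta)
        results.append((base_demand + delta, plan, cost))
    return results


if __name__ == "__main__":
    for demand, plan, cost in demand_sensitivity(100):
        print(demand, plan, round(cost, 2))
```

Comparing the plans across the cases shows which allocations shift when an input moves
and which stay put, which is usually the question the analyst brings to the programmer.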


I  would  like  to  restate  my  position  on  the  final  stage.     It  is  my  opinion  that 
direct  output  from  the  computer  is  not  the  best  mode  of  communicating  the  optimal 
solution.     It  is  more  important  that  the  report  writer  capability  support  the  efforts 
of  a  systems  analyst  in  interpreting  the  results. 

Communication of the plan can take various forms depending on the audience and
nature of the activity itself. Normally, greater understanding is obtained through
graphs, tables, and diagrams. Prose should be used to elaborate the results. A clear
statement of all assumptions and significant limits should be made.

One of the most important points I would like to communicate is the type of results
generally looked for. I hope that an understanding of this will help the mathematical
programmer to do a better job in using program logic. The most important part of the
solution is the direction a business activity should take and not the exact numbers
contained in the solution. The business environment changes so rapidly that it is nearly
impossible to pin down exact values. A good understanding of the direction or strategy
that should be applied in critical areas such as production, inventory, distribution, etc.,
is essential. Programming efforts should be carried out with a greater understanding of
this fact.


My  objective  during  this  presentation  has  been  to  give  you  more  insight  into  the 
problems  and  techniques  of  running  a  large  mathematically  programmed  planning  system.  I 
hope  that  this  knowledge  will  help  you  or  give  you  a  better  feel  for  some  of  the  critical 
elements  in  programs  supporting  large  operating  systems.     I'm  sure  that  a  slightly  dif- 
ferent viewpoint  would  be  presented  if  the  system  under  discussion  was  an  on-line 
system.     I  do  not  have  much  experience  with  on-line  systems  and  you  should  keep  in  mind 
that  all  of  my  discussion  is  based  on  batch  type  systems. 

I  support  your  efforts  in  trying  to  create  better  understanding  between  system 
designers  and  system  operators.     I'm  sure  that  all  of  us  will  be  able  to  do  a  better  job 
if  we  apply  some  of  the  experience  gained  through  understanding  the  other  person's 
problems.


MANAGING A LARGE SCALE PRODUCTION AND DISTRIBUTION SCHEDULING SYSTEM


Kenneth  Goldfisher 
Nabisco,  Inc. 


The  Biscuit  Division  of  the  Nabisco  Corporation  presently  has  16  bakeries,  each 
capable  of  producing  some,  but  not  all,   of  approximately  750  different  products.  Ship- 
ping branches  perform  the  warehouse  and  distribution  functions  required  after  the  bakery 
has  produced  and  packaged  the  products.     They  supply  products  to  local  sales  branches, 
of  which  there  are  approximately  225  located  throughout  the  United  States. 

Estimates of sales by product, over a 12 week horizon, are made at each sales
branch. These are sent by telecommunication equipment to a central location where they
are converted into production requirements, which are then allocated to individual
production facilities, i.e., ovens, icing machines and packing equipment. The resulting
production plan specifies the amount of each product to produce on each facility for each
sales branch during each of the four-week periods in the horizon. The production plan
minimizes the total variable cost of production and distribution while ensuring that
the plan is feasible to produce.
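
To make the structure of such a plan concrete, the following Python sketch evaluates
candidate plans indexed by facility, product, sales branch, and period. It is only an
illustration under stated assumptions: every name and number in it is hypothetical, and
it checks and costs a given plan rather than optimizing one, which the actual system
does by mathematical programming.

```python
from collections import defaultdict

# All figures are hypothetical; they only mirror the structure of the plan.
PRODUCTION_COST = {("oven_1", "cracker"): 0.40, ("oven_2", "cracker"): 0.45}
SHIPPING_COST = {("oven_1", "branch_east"): 0.10, ("oven_2", "branch_east"): 0.02}
CAPACITY = {("oven_1", 1): 500, ("oven_2", 1): 300}   # units per facility, period
REQUIREMENT = {("cracker", "branch_east", 1): 600}    # units per product, branch, period


def plan_cost(plan):
    """Total variable cost: production plus distribution over all plan entries."""
    return sum(
        qty * (PRODUCTION_COST[(fac, prod)] + SHIPPING_COST[(fac, branch)])
        for (fac, prod, branch, period), qty in plan.items()
    )


def plan_violations(plan):
    """Return readable messages for capacity overloads and unmet requirements."""
    load = defaultdict(float)
    supplied = defaultdict(float)
    for (fac, prod, branch, period), qty in plan.items():
        load[(fac, period)] += qty
        supplied[(prod, branch, period)] += qty
    problems = []
    for key, used in load.items():
        if used > CAPACITY[key]:
            problems.append(f"{key} over capacity: {used} > {CAPACITY[key]}")
    for key, needed in REQUIREMENT.items():
        if supplied[key] < needed:
            problems.append(f"{key} short: {supplied[key]} < {needed}")
    return problems


# Two feasible candidate plans and one that overloads oven_2.
plan_a = {("oven_1", "cracker", "branch_east", 1): 500,
          ("oven_2", "cracker", "branch_east", 1): 100}
plan_b = {("oven_1", "cracker", "branch_east", 1): 300,
          ("oven_2", "cracker", "branch_east", 1): 300}
plan_c = {("oven_2", "cracker", "branch_east", 1): 600}

if __name__ == "__main__":
    for name, plan in (("a", plan_a), ("b", plan_b), ("c", plan_c)):
        print(name, round(plan_cost(plan), 2), plan_violations(plan))
```

In this toy case the plan that leans on the oven with the lower combined production and
shipping cost is cheaper, but only up to that oven's capacity; the trade-off between
cost and feasibility is what the optimization resolves across all products and branches.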

This  paper  concentrates  on  the  operational  aspects  of  the  system,   the  use  of  the 
computer  and  its  relationship  to  the  real  world,   both  from  the  field  unit  and  the 
division  management  point  of  view. 